Abstract. A theorem of Leibman [35] asserts that a polynomial orbit $(g(n)\Gamma)_{n \in \mathbb{Z}}$
on a nilmanifold $G/\Gamma$ is always equidistributed in a union of closed sub-nilmanifolds
of $G/\Gamma$. In this paper we give a quantitative version of Leibman's result, describing
the uniform distribution properties of a finite polynomial orbit $(g(n)\Gamma)_{n \in [N]}$ in a
nilmanifold. More specifically we show that there is a factorization $g = \varepsilon g' \gamma$, where
$\varepsilon(n)$ is "smooth", $(\gamma(n)\Gamma)_{n \in \mathbb{Z}}$ is periodic and "rational", and $(g'(n)\Gamma)_{n \in P}$ is
uniformly distributed (up to a specified error $\delta$) inside some subnilmanifold $G'/\Gamma'$ of
$G/\Gamma$ for all sufficiently dense arithmetic progressions $P \subseteq [N]$.

Our bounds are uniform in $N$ and are polynomial in the error tolerance $\delta$. In a
companion paper [13] we shall use this theorem to establish the Möbius and Nilsequences
conjecture from our earlier paper [12].



1. Introduction 

Nilmanifolds. In the last few years it has come to be appreciated that nilmani- 
folds, together with orbits on them, play a fundamental role in combinatorial number 
theory. Their relevance was certainly apparent in [8], and it has been displayed quite 
dramatically in recent ergodic-theoretic work of Host-Kra [16] and Ziegler [35] . More 
recently the authors have explored how nilmanifolds arise in additive combinatorics [10] 
and in the study of linear equations in the primes [12]. The present paper is a part of 
that programme (and in particular will be used to prove the Möbius and Nilsequences
conjecture from [12] in the companion [13] to this paper) but, since it concerns only
the intrinsic properties of nilmanifolds, may be read independently of any of the other 
work. The reader interested in the background may consult the surveys [HI [EHJ ED] or 
the paper [12].

We begin by setting out our notation for nilmanifolds. 

Definition 1.1 (Filtrations and Nilmanifolds). Let $G$ be a connected, simply connected
Lie group with identity element $\mathrm{id}_G$. For the purposes of this paper we define a
filtration $G_\bullet$ on $G$ to be a sequence of closed connected subgroups

$$G = G_0 = G_1 \supseteq G_2 \supseteq \cdots \supseteq G_d \supseteq G_{d+1} = \{\mathrm{id}_G\}$$

which has the property that $[G_i, G_j] \subseteq G_{i+j}$ for all integers $i, j \geq 0$. The least integer
$d$ for which $G_{d+1} = \{\mathrm{id}_G\}$ is called the degree of the filtration $G_\bullet$. Here, as usual,
the commutator group $[H, K]$ is the group generated by $\{[h, k] : h \in H, k \in K\}$, where
$[h, k] := hkh^{-1}k^{-1}$ is the commutator of $h$ and $k$. If $G$ possesses a filtration then we
say that $G$ is nilpotent. Let $\Gamma \subseteq G$ be a uniform subgroup (i.e. a discrete, cocompact
subgroup). Then the quotient $G/\Gamma = \{g\Gamma : g \in G\}$ is called a nilmanifold. We also
write $g \ (\mathrm{mod}\ \Gamma)$ for $g\Gamma$.

Throughout the paper we will write $m = \dim G$ and $m_i = \dim G_i$, $i = 1, \ldots, d$.

Remark. The assumptions of connectedness and simple-connectedness for $G$ are not
completely standard, but are very convenient for us. In any situation in which we
apply our theorems, we expect to be able to reduce to this case. If a filtration $G_\bullet$ of
degree $d$ exists then it is easy to see that the lower central series filtration¹ defined by
$G = G_0 = G_1$, $G_{i+1} = [G, G_i]$ terminates with $G_{s+1} = \{\mathrm{id}_G\}$ for some integer $s \leq d$.
We call the minimal such integer $s$ the step of the nilpotent Lie group $G$. In this paper
the degree $d$ will play a vastly more important role than the step $s$, since it will be
important to work with filtrations more general than the lower central series.

Examples. The simplest examples of nilmanifolds arise when $s = 1$, in which case
we may, after a linear transformation, take $G = \mathbb{R}^m$ and $\Gamma = \mathbb{Z}^m$. The lower central
series filtration is given by $G = G_0 = G_1$ and $G_2 = \{\mathrm{id}_G\}$. The nilmanifold $G/\Gamma$ is
then referred to as a torus. Note that in this example the group operation is written
additively, as is conventional for abelian groups. When we are working with non-abelian
groups we shall write the group operation multiplicatively. The simplest non-abelian
example is given by the 3-dimensional Heisenberg nilmanifold, in which $s = 2$. We will
study this object in some detail later on. Here we take

$$G = \begin{pmatrix} 1 & \mathbb{R} & \mathbb{R} \\ 0 & 1 & \mathbb{R} \\ 0 & 0 & 1 \end{pmatrix} \quad \text{and} \quad \Gamma = \begin{pmatrix} 1 & \mathbb{Z} & \mathbb{Z} \\ 0 & 1 & \mathbb{Z} \\ 0 & 0 & 1 \end{pmatrix}. \tag{1.1}$$

The lower central series filtration is given by $G = G_0 = G_1$,

$$G_2 = \begin{pmatrix} 1 & 0 & \mathbb{R} \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}$$

and $G_3 = \{\mathrm{id}_G\}$. Observe that a fundamental domain for the action of $\Gamma$ on $G$ is

$$\left\{ \begin{pmatrix} 1 & x_1 & x_2 \\ 0 & 1 & x_3 \\ 0 & 0 & 1 \end{pmatrix} : 0 \leq x_1, x_2, x_3 < 1 \right\}. \tag{1.2}$$

Thus one can view $G/\Gamma$ as a unit cube, with the sides glued together in a twisted
fashion. ◇
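
The twisted gluing can be made concrete with a small computational sketch. The Python code below is an illustration only and is not taken from the paper: it stores the matrix in (1.2) as a tuple (x1, x2, x3), and right-multiplies by a suitable integer matrix to land in the fundamental domain. The names `heis_mul` and `reduce_mod_gamma` are hypothetical helpers introduced for this sketch.

```python
from fractions import Fraction
from math import floor

def heis_mul(g, h):
    """Product of matrices [[1,x1,x2],[0,1,x3],[0,0,1]] stored as (x1, x2, x3)."""
    x1, x2, x3 = g
    y1, y2, y3 = h
    return (x1 + y1, x2 + y2 + x1 * y3, x3 + y3)

def reduce_mod_gamma(g):
    """Return (rep, gamma) with rep = g * gamma, gamma integral, rep in [0,1)^3."""
    x1, x2, x3 = g
    a = -floor(x1)             # integer shift for the (1,2) entry
    b = -floor(x3)             # integer shift for the (2,3) entry
    c = -floor(x2 + x1 * b)    # shift for the (1,3) entry, after the twist x1*b
    gamma = (a, c, b)
    return heis_mul(g, gamma), gamma

g = (Fraction(5, 2), Fraction(7, 3), Fraction(-1, 4))
rep, gamma = reduce_mod_gamma(g)
print(rep, gamma)
```

The term `x1 * b` is where the twist appears: shifting the (2,3) entry by an integer also moves the (1,3) entry by a multiple of x1, so the three coordinates cannot be reduced independently as they could on a torus.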

This paper will be concerned with the qualitative and quantitative equidistribution
of various algebraic sequences on nilmanifolds. We first set out our notation for
equidistribution.

Definition 1.2 (Equidistribution). Let $G/\Gamma$ be a nilmanifold. Here and in the sequel
we endow $G/\Gamma$ with the unique normalised Haar measure, we let $[N] := \{n \in \mathbb{Z} : 1 \leq n \leq N\}$,
and we write $\mathbb{E}_{a \in A} f(a) := \frac{1}{|A|} \sum_{a \in A} f(a)$ for the average of $f$ on the set $A$.

(i) An infinite sequence $(g(n)\Gamma)_{n \in \mathbb{N}}$ in $G/\Gamma$ is said to be equidistributed if we have

$$\lim_{N \to \infty} \mathbb{E}_{n \in [N]} F(g(n)\Gamma) = \int_{G/\Gamma} F$$

for all continuous functions $F : G/\Gamma \to \mathbb{C}$.



¹ It is not hard to see that the lower central series filtration is a filtration, in that we have
$[G_i, G_j] \subseteq G_{i+j}$ for all $i, j \geq 0$.
(ii) An infinite sequence $(g(n)\Gamma)_{n \in \mathbb{Z}}$ in $G/\Gamma$ is said to be totally equidistributed if
the sequences $(g(an + r)\Gamma)_{n \in \mathbb{N}}$ are equidistributed for all $a \in \mathbb{Z} \setminus \{0\}$ and $r \in \mathbb{Z}$.

(iii) Given a length $N > 0$ and an error tolerance $\delta > 0$, a finite sequence $(g(n)\Gamma)_{n \in [N]}$
is said to be $\delta$-equidistributed if we have

$$\left| \mathbb{E}_{n \in [N]} F(g(n)\Gamma) - \int_{G/\Gamma} F \right| \leq \delta \|F\|_{\mathrm{Lip}}$$

for all Lipschitz functions $F : G/\Gamma \to \mathbb{C}$, where

$$\|F\|_{\mathrm{Lip}} := \|F\|_\infty + \sup_{x, y \in G/\Gamma,\ x \neq y} \frac{|F(x) - F(y)|}{d_{G/\Gamma}(x, y)}$$

and the metric $d_{G/\Gamma}$ on $G/\Gamma$ will be defined in Definition 2.2 in the next section
(it will involve choosing a Mal'cev basis $\mathcal{X}$ for $G/\Gamma$).

(iv) A finite sequence $(g(n)\Gamma)_{n \in [N]}$ is said to be totally $\delta$-equidistributed if we have

$$\left| \mathbb{E}_{n \in P} F(g(n)\Gamma) - \int_{G/\Gamma} F \right| \leq \delta \|F\|_{\mathrm{Lip}}$$

for all Lipschitz functions $F : G/\Gamma \to \mathbb{C}$ and all arithmetic progressions $P \subseteq [N]$
of length at least $\delta N$.

We will be interested in the qualitative question of when a sequence $(g(n)\Gamma)_{n \in \mathbb{N}}$ is
equidistributed (or totally equidistributed), as well as the more quantitative question of
when a finite sequence $(g(n)\Gamma)_{n \in [N]}$ is $\delta$-equidistributed (or totally $\delta$-equidistributed).
Such questions, and corresponding questions in more general settings (for example when
$G/\Gamma$ is a homogeneous space of a general, not necessarily nilpotent, Lie group) play a
fundamental role in number theory; see [M] for a discussion. These questions are also
closely related to the celebrated theorem of Ratner [28] on unipotent flows, although as
we are restricting attention to nilmanifolds, we will not need the full force of Ratner's
theorem (or quantitative versions thereof) here.

Qualitative equidistribution theory of linear sequences. To begin the
discussion let us first restrict attention to linear sequences.

Definition 1.3 (Linear sequences). A linear sequence in a group $G$ is any sequence
$g : \mathbb{Z} \to G$ of the form $g(n) := a^n x$ for some $a, x \in G$. A linear sequence in a nilmanifold
$G/\Gamma$ is a sequence of the form $(g(n)\Gamma)_{n \in \mathbb{Z}}$, where $g : \mathbb{Z} \to G$ is a linear sequence in $G$.

In the additive case $G = \mathbb{R}^m$, $\Gamma = \mathbb{Z}^m$, a linear sequence takes the form
$(an + x \ (\mathrm{mod}\ \mathbb{Z}^m))_{n \in \mathbb{Z}}$. In this case one can understand equidistribution satisfactorily using
Kronecker's theorem and its variants. For instance, to answer qualitative questions
about equidistribution in this case, we have the following classical result.

Theorem 1.4 (Qualitative Kronecker theorem). Let $m \geq 1$, and let $(g(n) \ (\mathrm{mod}\ \mathbb{Z}^m))_{n \in \mathbb{Z}}$
be a linear sequence in the torus $\mathbb{R}^m/\mathbb{Z}^m$. Then exactly one of the following statements
is true.

(i) $(g(n) \ (\mathrm{mod}\ \mathbb{Z}^m))_{n \in \mathbb{N}}$ is equidistributed in $\mathbb{R}^m/\mathbb{Z}^m$.

(ii) There exists a non-trivial character $\eta : \mathbb{R}^m \to \mathbb{R}/\mathbb{Z}$, i.e. a continuous additive
homomorphism which annihilates $\mathbb{Z}^m$ but does not vanish entirely, such that $\eta \circ g$
is constant. (Equivalently, if $g(n) = an + x$, there exists a non-zero $k \in \mathbb{Z}^m$
such that $k \cdot a \in \mathbb{Z}$.)

In particular, $(g(n) \ (\mathrm{mod}\ \mathbb{Z}^m))_{n \in \mathbb{N}}$ is equidistributed if and only if it is totally
equidistributed.
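
As a purely numerical illustration of this dichotomy in the case $m = 1$ (it is not part of the argument), the following Python sketch averages the character $n \mapsto e(k \alpha n)$ over $n \in [N]$, where $e(t) = \exp(2\pi i t)$. The function name `weyl_average` and the choices of $\alpha$, $k$ and $N$ are assumptions made for the example. For $\alpha = \sqrt{2}$ the averages are small for each fixed $k \neq 0$, while for $\alpha = 1/3$ the character with $k = 3$ is constant along the sequence, as in alternative (ii).

```python
import cmath
import math

def weyl_average(alpha, k, N):
    """Average of e(k * alpha * n) over n = 1..N, where e(t) = exp(2*pi*i*t)."""
    total = sum(cmath.exp(2j * math.pi * k * alpha * n) for n in range(1, N + 1))
    return total / N

N = 10_000
for k in (1, 2, 3):
    irrational_case = abs(weyl_average(math.sqrt(2), k, N))
    rational_case = abs(weyl_average(1 / 3, k, N))
    print(k, irrational_case, rational_case)
```

Floating-point arithmetic cannot of course distinguish rational from irrational slopes; the sketch only shows the size of the exponential sums at one fixed scale $N$.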

Remarks. An equivalent formulation of this theorem is that if the linear sequence
$(g(n) \ (\mathrm{mod}\ \mathbb{Z}^m))_{n \in \mathbb{N}}$
is not equidistributed, then this sequence instead takes values in a finite union of proper
subtori of $G/\Gamma$. This can be viewed as an extremely simple special case of the theorems
of Ratner [28] and Shah [29]. More quantitative results can be obtained via Fourier
analysis²; see Proposition 3.1 below.

A remarkable theorem of Leon Green allows one to reduce qualitative questions about
the distribution of orbits on nilmanifolds of step $s > 1$ to the abelian case just described.

Definition 1.5 (Horizontal torus). Given a nilmanifold $G/\Gamma$, the horizontal torus is
defined to be $(G/\Gamma)_{\mathrm{ab}} := G/[G, G]\Gamma$. We let $\pi : G \to (G/\Gamma)_{\mathrm{ab}}$ be the canonical
projection map. A horizontal character is a continuous additive homomorphism $\eta : G \to \mathbb{R}/\mathbb{Z}$
which annihilates $\Gamma$; observe that such characters in fact annihilate $[G, G]\Gamma$ and so can
be viewed as characters on the horizontal torus. We say that a horizontal character is
non-trivial if it is not identically zero.

It follows from results of Mal'cev [25], and in particular the existence of so-called
Mal'cev bases, that $(G/\Gamma)_{\mathrm{ab}}$ really is a torus and in fact is isomorphic to $\mathbb{R}^{m_{\mathrm{ab}}}/\mathbb{Z}^{m_{\mathrm{ab}}}$
where $m_{\mathrm{ab}} := \dim(G) - \dim([G, G])$. We will not actually need this characterisation,
as the properties of horizontal characters $\eta : G \to \mathbb{R}/\mathbb{Z}$ will be our main focus. Readers
may find it useful to keep this in mind, however.

Theorem 1.6 (Leon Green's theorem). Let $(g(n)\Gamma)_{n \in \mathbb{Z}}$ be a linear sequence in a
nilmanifold $G/\Gamma$. Then the orbit $(g(n)\Gamma)_{n \in \mathbb{N}}$ is equidistributed in $G/\Gamma$ if and only if the
projected orbit $(\pi(g(n)\Gamma))_{n \in \mathbb{N}}$ is equidistributed in the horizontal torus $(G/\Gamma)_{\mathrm{ab}}$. (In
particular, $(g(n)\Gamma)_{n \in \mathbb{Z}}$ is equidistributed if and only if it is totally equidistributed.)

Proof. See [H [TJ]. Leon Green used representation theory to establish his result, 
but a more elementary proof was subsequently found by Parry [26]. □

Example. Suppose that $G/\Gamma$ is the Heisenberg example (1.1). Then

$$[G, G] = \begin{pmatrix} 1 & 0 & \mathbb{R} \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}$$

and $(G/\Gamma)_{\mathrm{ab}}$ may be identified with $\mathbb{R}^2/\mathbb{Z}^2$, the projection $\pi$ being given by

$$\pi \begin{pmatrix} 1 & x_1 & x_2 \\ 0 & 1 & x_3 \\ 0 & 0 & 1 \end{pmatrix} = (x_1, x_3).$$

Leon Green's theorem implies that the orbit $(a^n \Gamma)_{n \in \mathbb{N}}$, where

$$a = \begin{pmatrix} 1 & \alpha_1 & \alpha_2 \\ 0 & 1 & \alpha_3 \\ 0 & 0 & 1 \end{pmatrix},$$

is equidistributed in $G/\Gamma$ if and only if $1$, $\alpha_1$ and $\alpha_3$ are independent over $\mathbb{Q}$. It is already
somewhat nontrivial to establish this result directly. ◇

² In this simple setting one could also use more classical tools such as Minkowski's geometry of
numbers, and in the $m = 1$ case one could even use continued fractions. However, these methods do
not seem to extend easily to higher steps.

By Kronecker's theorem, we can then recast Theorem 1.6 in the following equivalent
formulation:

Theorem 1.7 (Leon Green's theorem, again). Let $(g(n)\Gamma)_{n \in \mathbb{Z}}$ be a linear sequence in
a nilmanifold $G/\Gamma$. Then exactly one of the following statements is true:

(i) $(g(n)\Gamma)_{n \in \mathbb{N}}$ is equidistributed in $G/\Gamma$.

(ii) There exists a non-trivial horizontal character $\eta : G \to \mathbb{R}/\mathbb{Z}$ such that $\eta \circ g$ is
constant.

Qualitative equidistribution theory of polynomial sequences. While 
our primary applications are concerned with linear sequences, it turns out for various 
technical reasons that it is important to work in the more general class of polynomial 
sequences. 

Definition 1.8 (Polynomial sequences in nilpotent groups). Suppose that $G$ is a nilpotent
group with a filtration $G_\bullet$. Let $g : \mathbb{Z} \to G$ be a sequence. If $h \in \mathbb{Z}$ we write
$\partial_h g(n) := g(n + h) g(n)^{-1}$. We say that $g$ is a polynomial sequence with coefficients in $G_\bullet$,
and write $g \in \mathrm{poly}(\mathbb{Z}, G_\bullet)$, if $\partial_{h_i} \cdots \partial_{h_1} g$ takes values in $G_i$ for all positive integers $i$ and
for all choices of $h_1, \ldots, h_i \in \mathbb{Z}$. In this case we say that $g$ has degree $d$. If $g$ lies in
$\mathrm{poly}(\mathbb{Z}, G_\bullet)$ for some filtration $G_\bullet$ then we simply say that $g$ is a polynomial sequence.

This definition is a little abstract. However we will show in §6 that $g : \mathbb{Z} \to G$
is a polynomial sequence if and only if $g$ has the form $g(n) = a_1^{p_1(n)} \cdots a_k^{p_k(n)}$, where
$a_1, \ldots, a_k \in G$ and the $p_i : \mathbb{N} \to \mathbb{N}$ are polynomials. In particular a linear sequence
$g(n) = a^n x$ is a polynomial sequence, and in fact since $\partial_{h_1} g(n) = a^{h_1}$ and $\partial_{h_2} \partial_{h_1} g(n) = \mathrm{id}_G$
it is clear that such a sequence has coefficients in the lower central series filtration
$G_\bullet$. Note carefully that the degree of a linear sequence is equal to the step $s$ of the
underlying Lie group $G$, and is not equal to one as the name "linear" might suggest.
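
The two derivative identities for a linear sequence can be checked mechanically. The sketch below is illustrative only: it works in the Heisenberg group, with the matrix [[1, x1, x2], [0, 1, x3], [0, 0, 1]] stored as a tuple (x1, x2, x3), and the helper names `mul`, `inv`, `power` and `derivative`, as well as the particular elements `a` and `x`, are assumptions of the example.

```python
from fractions import Fraction

IDENTITY = (0, 0, 0)

def mul(g, h):
    """Product of unipotent 3x3 matrices stored as (x1, x2, x3)."""
    x1, x2, x3 = g
    y1, y2, y3 = h
    return (x1 + y1, x2 + y2 + x1 * y3, x3 + y3)

def inv(g):
    """Inverse matrix, in the same tuple encoding."""
    x1, x2, x3 = g
    return (-x1, x1 * x3 - x2, -x3)

def power(g, n):
    """g**n for any integer n, by repeated multiplication."""
    base = g if n >= 0 else inv(g)
    result = IDENTITY
    for _ in range(abs(n)):
        result = mul(result, base)
    return result

def derivative(seq, h):
    """Discrete derivative: n -> seq(n + h) * seq(n)^{-1}."""
    return lambda n: mul(seq(n + h), inv(seq(n)))

a = (Fraction(1, 2), Fraction(1, 3), Fraction(1, 5))
x = (Fraction(2, 7), Fraction(0), Fraction(1))

def g(n):
    return mul(power(a, n), x)   # the linear sequence g(n) = a^n x

h1, h2 = 2, -1
first = derivative(g, h1)
second = derivative(first, h2)
for n in range(-3, 4):
    assert first(n) == power(a, h1)   # first derivative is the constant a^{h1}
    assert second(n) == IDENTITY      # second derivative is trivial
```

The first derivative is constant in $n$ but is a general element of $G = G_1$, which is why the sequence has coefficients in the lower central series filtration rather than in a filtration of degree one.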

A remarkable result of Lazard and Leibman [TjJl [201 EI] asserts that poly(Z, G,) is a 
group. We will prove this in §6 and it will play a key role in several of our arguments.

Theorem 1.6 was extended by Leibman [22] to the case when $g(n)$ is a polynomial
sequence rather than a linear one. In particular, he showed the following generalisation
of Theorem 1.7.

Theorem 1.9 (Leibman's theorem). [22] Suppose that $G/\Gamma$ is a nilmanifold, and that
$g : \mathbb{Z} \to G$ is a polynomial sequence. Then exactly one of the following statements is
true:

(i) $(g(n)\Gamma)_{n \in \mathbb{N}}$ is equidistributed in $G/\Gamma$.

(ii) There exists a non-trivial horizontal character $\eta : G \to \mathbb{R}/\mathbb{Z}$ such that $\eta \circ g$ is
constant.

Remark. This theorem significantly generalizes the classical theorem of Weyl that
a polynomial sequence in $\mathbb{R}/\mathbb{Z}$ is equidistributed unless all of its non-constant
coefficients are rational. We will in fact use a quantitative version of Weyl's theorem in our
arguments; see Proposition 4.3 below.
We can iterate this theorem to establish a factorization result. We first need some
notation.

Definition 1.10 (Rational subgroup). Let $G/\Gamma$ be a nilmanifold. A rational subgroup
of $G$ is a closed connected subgroup $G'$ of $G$ such that $G'\Gamma/\Gamma \cong G'/\Gamma'$, where $\Gamma' := G' \cap \Gamma$, is
a closed submanifold of $G/\Gamma$ (or equivalently, that $\Gamma'$ is a cocompact subgroup of $G'$).
We say that $G'$ is proper if $G' \neq G$.

Example. If $G/\Gamma$ is a nilmanifold (that is to say if there exists a uniform subgroup
$\Gamma \leq G$) one can show that each member $G_i$ of the lower central series is a rational
subgroup; see e.g. jl] or [25]. o 

Definition 1.11 (Rational sequence). Let $G/\Gamma$ be a nilmanifold. A rational group
element is any $g \in G$ such that $g^r \in \Gamma$ for some integer $r > 0$. A rational point is any
point in $G/\Gamma$ of the form $g\Gamma$ for some rational group element $g$. A sequence $(g(n)\Gamma)_{n \in \mathbb{Z}}$
is rational if every element $g(n)\Gamma$ in the sequence is a rational point.

Remark. It is not difficult to show that the rational group elements form a dense
subgroup of $G$ that contains $\Gamma$; see Lemma A.11. We will show in Lemma A.12 that
any polynomial sequence in $G/\Gamma$ which is rational is automatically periodic.

Corollary 1.12 (Factorization theorem for polynomial sequences). Let $(g(n)\Gamma)_{n \in \mathbb{Z}}$ be a
polynomial sequence in a nilmanifold $G/\Gamma$. Then there exists a rational subgroup $G'$ of
$G$ and a factorization $g = \varepsilon g' \gamma$, where $\varepsilon \in G$ is a constant, $g' : \mathbb{Z} \to G'$ is a polynomial
sequence such that $(g'(n)\Gamma')_{n \in \mathbb{Z}}$ is totally equidistributed in $G'/\Gamma'$ (where $\Gamma' := G' \cap \Gamma$),
and $\gamma : \mathbb{Z} \to G$ is a polynomial sequence such that the sequence $(\gamma(n)\Gamma)_{n \in \mathbb{Z}}$ is rational
(and hence, by Lemma A.12 (i), is periodic).

Proof. We give a sketch of this argument only; we will repeat this argument in more
detail when proving Theorem 1.19 below.

We induct on the dimension $m$ of $G/\Gamma$, assuming that the claim has already been
proven for all nilmanifolds of lesser dimension. By replacing $g(n)$ with $g(0)^{-1} g(n)$ if
necessary (absorbing the $g(0)$ factor into the $\varepsilon$ term) we may normalise so that $g(0) = \mathrm{id}_G$.
If $(g(n)\Gamma)_{n \in \mathbb{Z}}$ is equidistributed on $G/\Gamma$, then it is totally equidistributed by Leibman's
theorem, and we are done (with $g' = g$, $G' = G$, and $\varepsilon, \gamma$ trivial). So we may assume
that $(g(n)\Gamma)_{n \in \mathbb{Z}}$ is not equidistributed. By Leibman's theorem, there exists a non-trivial
horizontal character $\eta : G \to \mathbb{R}/\mathbb{Z}$ such that $\eta \circ g$ is constant; in fact by our normalisation
$g(0) = \mathrm{id}_G$ we must have $\eta \circ g = 0$, thus $g$ takes values in $\ker(\eta)$. It is then not difficult
to factorise $g = g_0 \gamma_0$, where $\gamma_0$ is a polynomial sequence with $(\gamma_0(n)\Gamma)_{n \in \mathbb{Z}}$ rational and
periodic, and $g_0$ is a polynomial sequence taking values in the proper rational subgroup
$G' \leq G$, defined to be the connected component of $\ker(\eta)$ which contains the origin. The
claim then follows by applying the induction hypothesis to the sequence $(g_0(n)\Gamma')_{n \in \mathbb{Z}}$ in
the nilmanifold $G'/\Gamma'$, which has dimension $m - 1$, and using the fact that the product
of two rational group elements is again rational, as well as the trivial observation that
rational group elements of $G'$ are automatically rational group elements of $G$ also. □

Remark. In words, this corollary asserts that in the qualitative setting, one can 
decompose 

(arbitrary polynomial sequence) = (constant) x (totally equidistributed) x (periodic). 
An inspection of the proof reveals that one can in fact take the constant $\varepsilon$ to be $g(0)$.

As a corollary we obtain a Ratner-Shah type theorem for polynomial sequences in
nilmanifolds, first established by Leibman [22]:

Corollary 1.13 (Leibman's Ratner-Shah type theorem for nilmanifolds). Let $(g(n)\Gamma)_{n \in \mathbb{Z}}$
be a polynomial sequence in a nilmanifold $G/\Gamma$. Then there exists a rational subgroup $G'$
of $G$, a group element $\varepsilon \in G$, and a rational periodic sequence $(x_n)_{n \in \mathbb{Z}}$ in $G/\Gamma$ with some
period $q$ such that for every $r \in \mathbb{Z}$, the sequence $(g(qn + r)\Gamma)_{n \in \mathbb{Z}}$ is totally equidistributed
in $\varepsilon G' x_r$.

Remark. Shah [29] obtained a similar result for arbitrary discrete unipotent (but
linear) flows on a finite volume homogeneous space; the case of continuous unipotent
linear flows was treated earlier by Ratner [28] (see [5] for further discussion). Leibman's
proof of Corollary 1.13 does not use these results, but instead proceeds in two stages.
Firstly, by iterating Theorem 1.6 (or more precisely a generalization of this theorem
to the case when $G$ is not necessarily connected), a version of Corollary 1.13 for linear
sequences is obtained. Secondly, by utilising a lifting trick of Furstenberg [7, p. 31],
the polynomial case is deduced from the linear case. As we shall discuss shortly, these
arguments do not work well in the quantitative case, and one must instead grapple with
polynomial sequences directly.

Quantitative equidistribution results. This paper stems from an attempt to 
establish quantitative versions of the above theorems for finite orbits. Unfortunately, 
the need for quantitative bounds on all aspects of these results forces us to introduce a 
substantial amount of new notation. 

Definition 1.14 (Asymptotic notation). We use $Y = O(X)$ or $Y \ll X$ to denote the
estimate $|Y| \leq CX$ for some absolute constant $C$. When we need to indicate dependence of
$C$ on various parameters, we shall indicate this by subscripts, thus for instance $O_{d,m}(X)$
denotes a quantity bounded in magnitude by $C_{d,m} X$ for some $C_{d,m}$ depending only on
the quantities $d, m$.

Definition 1.15 (Circle norm). If $x \in \mathbb{R}/\mathbb{Z}$, we use $\|x\|_{\mathbb{R}/\mathbb{Z}} := \mathrm{dist}(x, \mathbb{Z})$ to denote the
distance of $x$ to the origin (thus $\|a \ (\mathrm{mod}\ \mathbb{Z})\|_{\mathbb{R}/\mathbb{Z}} = |a|$ whenever $-1/2 < a \leq 1/2$). If
$x \in \mathbb{R}$, we write $\|x\|_{\mathbb{R}/\mathbb{Z}}$ for $\|x \ (\mathrm{mod}\ \mathbb{Z})\|_{\mathbb{R}/\mathbb{Z}}$.
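
For readers who like to experiment, the circle norm of a real number is a one-line computation. The Python function below is a minimal sketch; the name `circle_norm` is our own, and exact rational inputs are used in the example to avoid rounding issues.

```python
import math
from fractions import Fraction

def circle_norm(x):
    """Distance from the real number x to the nearest integer."""
    frac = x - math.floor(x)      # representative of x mod 1 in [0, 1)
    return min(frac, 1 - frac)

print(circle_norm(Fraction(7, 4)))    # nearest integer is 2
print(circle_norm(Fraction(-1, 3)))   # nearest integer is 0
```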

Our first main result is the following quantitative version of Theorem 1.9. Note that
some of the terminology in this theorem will not be formally introduced until the next 
section, but this should not prevent the reader from gaining a rough appreciation of the 
statement. 

Theorem 1.16 (Quantitative Leibman theorem). Let $m, d \geq 0$, $0 < \delta < 1/2$, and $N \geq 1$.
Let $G/\Gamma$ be an $m$-dimensional nilmanifold together with a filtration $G_\bullet$ of degree $d$ and
a $\frac{1}{\delta}$-rational Mal'cev basis $\mathcal{X}$ adapted to this filtration. Suppose that $g \in \mathrm{poly}(\mathbb{Z}, G_\bullet)$.
Then at least one of the following statements is true:

(i) $(g(n)\Gamma)_{n \in [N]}$ is $\delta$-equidistributed in $G/\Gamma$.

(ii) There exists a non-trivial horizontal character $\eta : G \to \mathbb{R}/\mathbb{Z}$ with $|\eta| \ll \delta^{-O_{m,d}(1)}$
such that $\|\eta \circ g(n) - \eta \circ g(n - 1)\|_{\mathbb{R}/\mathbb{Z}} \ll \delta^{-O_{m,d}(1)}/N$ for all $n \in [N]$.
Remarks. The notions of a "$\frac{1}{\delta}$-rational Mal'cev basis adapted to $G_\bullet$", of the modulus
$|\eta|$ of a horizontal character and of the metric which is implicit in the notion of
$\delta$-equidistribution are technical and will be defined precisely in Definition 2.4, Definition
2.6 and Definition 2.2 respectively.

Theorem 1.16 asserts that the sequence $(g(n)\Gamma)_{n \in [N]}$ is either $\delta$-equidistributed up to
time $N$, or else it is very far from being equidistributed up to time $\delta^{O_{m,d}(1)} N$, being
concentrated very close to a union of $\delta^{-O_{m,d}(1)}$ subtori. One should view $N$ as being
very large compared to $1/\delta$, otherwise the content of the proposition is trivial. It is
not hard to deduce Theorem 1.9 from Theorem 1.16; we leave this to the reader as an
exercise.

For technical reasons it will be convenient later to strengthen the statement (ii)
slightly, so as to also control higher "derivatives" $\partial^j(\eta \circ g)$; see the next section for more
information.

Whereas in the qualitative setting one always works in the limit $N \to \infty$, in the
quantitative setting one works with a fixed (but large) $N$. As $N$ increases, there can
be transitions in the behaviour of the finite sequence $(g(n)\Gamma)_{n \in [N]}$, in which the
equidistribution (or lack thereof) changes significantly (cf. the "coalescence of progressions"
phenomenon [32, Chapter 12]); these transitions are a new feature of the quantitative
setting, which are not readily visible in the qualitative one. We illustrate this with a
simple example:

Example. Consider the (additive) example $G = \mathbb{R}$, $\Gamma = \mathbb{Z}$ and $g(n) = (\frac{1}{2} + \sigma)n$, where
$\sigma > 0$ is a small parameter. In this case we have $m = d = 1$. If $N$ is much larger than
$1/\sigma$, we see that $(g(n) \ (\mathrm{mod}\ \mathbb{Z}))_{n \in [N]}$ is $\delta$-equidistributed. On the other hand, if $N$ is
much smaller than $1/\sigma$, we see that $(g(n) \ (\mathrm{mod}\ \mathbb{Z}))_{n \in [N]}$ fails to be $\delta$-equidistributed;
indeed it is highly concentrated around $0$ and $1/2$ in this case. However, if we let
$\eta : G \to \mathbb{R}/\mathbb{Z}$ be the non-trivial horizontal character $\eta(x) := 2x \ (\mathrm{mod}\ \mathbb{Z})$ we see that $\eta(g(n))$
is slowly varying in the sense of (ii). The transitional regime when $N$ is comparable
to $1/\sigma$ is interesting; there is enough irregularity to prevent $\delta$-equidistribution on the
sequence $(g(n) \ (\mathrm{mod}\ \mathbb{Z}))_{n \in [N]}$, but in order to obtain near-constancy of $\eta(g(n))$ one in
fact has to pass to shorter sequences such as $(g(n) \ (\mathrm{mod}\ \mathbb{Z}))_{n \in [\delta^{100} N]}$. The need to work
on a variety of different scales like this is very much a feature of additive combinatorics,
particularly those parts of it that have the flavour of "quantitative ergodic theory". The
work of Bourgain [3] on Roth's theorem is another example. ◇
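
The two regimes in this example can be observed directly. The following Python sketch is illustrative only: the function names and the specific values of $\sigma$ and $N$ are assumptions chosen for the demonstration. It measures how far the points $g(n) \ (\mathrm{mod}\ \mathbb{Z})$ stray from the set $\{0, 1/2\}$, and how fast $\eta(g(n)) = 2g(n) \ (\mathrm{mod}\ \mathbb{Z})$ varies from one step to the next.

```python
from fractions import Fraction
from math import floor

def circle_norm(x):
    """Distance from x to the nearest integer."""
    frac = x - floor(x)
    return min(frac, 1 - frac)

def g(n, sigma):
    return (Fraction(1, 2) + sigma) * n

def max_distance_to_half_integers(N, sigma):
    """Largest distance from g(n) mod 1 to the set {0, 1/2}, over n in [N]."""
    # The distance from y to the nearest half-integer equals ||2y|| / 2.
    return max(circle_norm(2 * g(n, sigma)) / 2 for n in range(1, N + 1))

def max_step_of_eta(N, sigma):
    """Largest ||eta(g(n)) - eta(g(n-1))|| over n in [N], with eta(x) = 2x mod 1."""
    return max(circle_norm(2 * (g(n, sigma) - g(n - 1, sigma)))
               for n in range(1, N + 1))

# N much smaller than 1/sigma: the orbit hugs {0, 1/2}.
print(max_distance_to_half_integers(1000, Fraction(1, 10**6)))
# N much larger than 1/sigma: the orbit moves as far from {0, 1/2} as possible.
print(max_distance_to_half_integers(1000, Fraction(1, 100)))
# In both cases eta(g(n)) moves by exactly 2*sigma per step.
print(max_step_of_eta(1000, Fraction(1, 10**6)))
```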

Of course, by specialising to linear sequences, Theorem 1.16 also implies a quantitative
version of Leon Green's theorem. The proof of Theorem 1.16 could be simplified
somewhat in this case. Such a theorem is not especially useful, however. The following
example may help to illustrate why, in the quantitative setting, the consideration of
linear sequences leads naturally to the "polynomial" world.

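Example. (The skew torus) Let us consider the Heisenberg example (1.1) once more,
taking now

$$a := \begin{pmatrix} 1 & 2\alpha & \alpha \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{pmatrix},$$

where $\alpha := N^{-3/2}$. Set

$$g(n) := a^n = \begin{pmatrix} 1 & 2n\alpha & n^2\alpha \\ 0 & 1 & n \\ 0 & 0 & 1 \end{pmatrix}.$$

Translating to the fundamental domain, we obtain

$$g(n)\Gamma = \begin{pmatrix} 1 & \{2n\alpha\} & \{-n^2\alpha\} \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \Gamma.$$

(Here, and for the rest of the paper, we define $\{x\} := x - \lfloor x \rfloor$, where $\lfloor x \rfloor$ is the
greatest integer less than or equal to $x$.) The orbit $(g(n)\Gamma)_{n \in [N]}$ is certainly not close to
equidistributed in $G/\Gamma$, and indeed the projected orbit $(\pi(g(n)\Gamma))_{n \in [N]}$ stays very close
to the trivial subtorus $T \subseteq \mathbb{R}^2/\mathbb{Z}^2$ which consists simply of the point $\{(0, 0)\}$.

Now $\pi^{-1}(T)$ is of course isomorphic to a one-dimensional torus $\mathbb{R}/\mathbb{Z}$. However the
orbit $(g(n)\Gamma)_{n \in [N]}$ does not approximate a linear orbit on this torus; rather, it has
quadratic behaviour. Thus $(g(n)\Gamma)_{n \in [N]}$ is very close to $(g'(n)\Gamma')_{n \in [N]}$ on $G'/\Gamma' \cong \mathbb{R}/\mathbb{Z}$,
where

$$G' := \begin{pmatrix} 1 & 0 & \mathbb{R} \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \qquad \Gamma' := \begin{pmatrix} 1 & 0 & \mathbb{Z} \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}$$

and

$$g'(n) := \begin{pmatrix} 1 & 0 & -n^2\alpha \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}. \tag{1.3}$$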

Thus, in order to approximate the linear sequence $(g(n)\Gamma)_{n \in [N]}$ by a lower-dimensional
sequence, the latter sequence needs to be polynomial. Note however that if one had
the luxury of passing from $[N]$ to a much shorter progression, e.g. $[N^{1/100}]$, then the
lower-dimensional sequence would remain linear. In the limit $N \to \infty$, $N$ and $N^{1/100}$
both go to infinity, which may help explain why in the qualitative setting one can avoid
polynomial sequences entirely and work purely in the category of linear sequences.
Unfortunately, for the quantitative applications we have in mind (in particular, the
number-theoretic application in [13]) we cannot afford to reduce the scale $N$ in such a
drastic manner³.
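
The passage from a linear sequence in $G$ to a quadratic sequence in the fibre can be checked numerically. The sketch below is illustrative and rests on an assumption: it takes the generator $a$ to be the matrix with rows $(1, 2\alpha, \alpha)$, $(0, 1, 1)$, $(0, 0, 1)$, a choice that is consistent with the fundamental-domain representative $(\{2n\alpha\}, \{-n^2\alpha\}, 0)$ appearing in the example. Matrices [[1, x1, x2], [0, 1, x3], [0, 0, 1]] are stored as tuples (x1, x2, x3), and all helper names are hypothetical.

```python
from fractions import Fraction
from math import floor

def mul(g, h):
    """Product of unipotent 3x3 matrices stored as (x1, x2, x3)."""
    x1, x2, x3 = g
    y1, y2, y3 = h
    return (x1 + y1, x2 + y2 + x1 * y3, x3 + y3)

def frac(x):
    return x - floor(x)

def fundamental_rep(g):
    """Representative of g*Gamma with all three entries in [0, 1)."""
    x1, x2, x3 = g
    b = -floor(x3)                      # integer shift applied to the (2,3) entry
    return (frac(x1), frac(x2 + x1 * b), x3 + b)

def orbit_rep(n, alpha):
    """Reduce a^n (n >= 0) for the assumed a = [[1, 2a, a], [0, 1, 1], [0, 0, 1]]."""
    a = (2 * alpha, alpha, 1)
    g = (0, 0, 0)
    for _ in range(n):
        g = mul(g, a)
    return fundamental_rep(g)

N = 100
alpha = Fraction(1, 1000)               # N**(-3/2) for N = 100
for n in range(1, N + 1):
    assert orbit_rep(n, alpha) == (frac(2 * n * alpha), frac(-n * n * alpha), 0)
print(orbit_rep(N, alpha))
```

With these values the first coordinate never exceeds $2N\alpha = 1/5$, while the second coordinate $\{-n^2\alpha\}$ wraps around the circle several times as $n$ ranges over $[N]$.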

In much the same way that Theorem 1.9 could be iterated in order to establish
Corollary 1.12, we can iterate Theorem 1.16 to obtain a quantitative factorization theorem.
To state it we need quantitative versions of the "rationality" concepts of Definition 1.11
and also the new notion of smooth sequences, which must be introduced in place of
constant sequences in the finitary setting.

Definition 1.17 (Rational sequences, quantitative definitions). Let $G/\Gamma$ be a nilmanifold
and let $Q > 0$ be a parameter. We say that $\gamma \in G$ is $Q$-rational if $\gamma^r \in \Gamma$ for some
integer $r$, $0 < r \leq Q$. A $Q$-rational point is any point in $G/\Gamma$ of the form $\gamma\Gamma$ for some
$Q$-rational group element $\gamma$. A sequence $(\gamma(n))_{n \in \mathbb{Z}}$ is $Q$-rational if every element $\gamma(n)\Gamma$
in the sequence is a $Q$-rational point.
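
To get a feel for this definition, one can search for the exponent $r$ by brute force in the Heisenberg group (1.1). The Python sketch below is an illustration under stated assumptions: matrices [[1, x1, x2], [0, 1, x3], [0, 0, 1]] are stored as tuples (x1, x2, x3), and the names `heis_power` and `rationality_height` are hypothetical.

```python
from fractions import Fraction

def heis_power(g, r):
    """r-th power (r >= 0) of a unipotent 3x3 matrix stored as (x1, x2, x3)."""
    x1, x2, x3 = g
    # Closed form: the (1,3) entry picks up binomial(r, 2) * x1 * x3.
    return (r * x1, r * x2 + Fraction(r * (r - 1), 2) * x1 * x3, r * x3)

def rationality_height(g, Q):
    """Least integer r with 0 < r <= Q and g**r integral, or None if none exists."""
    for r in range(1, Q + 1):
        if all(Fraction(c).denominator == 1 for c in heis_power(g, r)):
            return r
    return None

gamma = (Fraction(1, 2), Fraction(0), Fraction(1, 2))
print(rationality_height(gamma, 10))   # least exponent not exceeding 10
print(rationality_height(gamma, 4))    # no admissible exponent up to 4
```

In this example every entry has denominator 2, yet the least admissible exponent is 8, because of the commutator term in the (1,3) entry. This is one reason why rationality is measured through powers lying in $\Gamma$ rather than through denominators of matrix entries.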

Definition 1.18 (Smooth sequences). Let $G/\Gamma$ be a nilmanifold with a Mal'cev basis
$\mathcal{X}$. Let $(\varepsilon(n))_{n \in \mathbb{Z}}$ be a sequence in $G$, and let $M, N \geq 1$. We say that $(\varepsilon(n))_{n \in \mathbb{Z}}$
is $(M, N)$-smooth if we have $d(\varepsilon(n), \mathrm{id}_G) \leq M$ and $d(\varepsilon(n), \varepsilon(n - 1)) \leq M/N$ for all
$n \in [N]$, where the metric $d = d_{\mathcal{X}}$ on $G$ will be defined in Definition 2.2.

Note that the notion of an $(M, N)$-smooth sequence collapses to that of a constant
sequence in the limit $N \to \infty$ (holding $M$ fixed).

³ This is ultimately because it is known how to obtain non-trivial control on averages of
number-theoretic functions such as the Möbius function $\mu$ on intervals such as $[N, N + N \log^{-A} N]$, but not in
intervals such as $[N, N + N^{1/100}]$, even if one assumes strong hypotheses such as GRH.

Theorem 1.19 (Factorization theorem). Let $m, d \geq 0$, and let $M_0, N \geq 1$ and $A > 0$
be real numbers. Suppose that $G/\Gamma$ is an $m$-dimensional nilmanifold together with a
filtration $G_\bullet$ of degree $d$. Suppose that $\mathcal{X}$ is an $M_0$-rational Mal'cev basis adapted to
$G_\bullet$ and that $g \in \mathrm{poly}(\mathbb{Z}, G_\bullet)$. Then there is an integer $M$ with $M_0 \leq M \leq M_0^{O_{A,m,d}(1)}$,
a rational subgroup $G' \subseteq G$, a Mal'cev basis $\mathcal{X}'$ for $G'/\Gamma'$ in which each element is
an $M$-rational combination of the elements of $\mathcal{X}$, and a decomposition $g = \varepsilon g' \gamma$ into
polynomial sequences $\varepsilon, g', \gamma \in \mathrm{poly}(\mathbb{Z}, G_\bullet)$ with the following properties:

(i) $\varepsilon : \mathbb{Z} \to G$ is $(M, N)$-smooth;

(ii) $g' : \mathbb{Z} \to G'$ takes values in $G'$, and the finite sequence $(g'(n)\Gamma')_{n \in [N]}$ is totally
$1/M^A$-equidistributed in $G'/\Gamma'$, using the metric $d_{\mathcal{X}'}$ on $G'/\Gamma'$;

(iii) $\gamma : \mathbb{Z} \to G$ is $M$-rational, and $(\gamma(n)\Gamma)_{n \in \mathbb{Z}}$ is periodic with period at most $M$.

Remark. In words, this theorem asserts that in the quantitative setting, one can
decompose

(arbitrary polynomial sequence) = (smooth) x (totally equidistributed) x (periodic).

The notion of a subgroup $G'$ being $M$-rational relative to a Mal'cev basis $\mathcal{X}$ will be
defined in Definition 2.5. This result has some faint resemblance to the Szemerédi
regularity lemma [30], although with the key difference that our bounds here are all
polynomial in nature.

The derivation of Theorem 1.19 from Theorem 1.16 will be performed in §9.

We will use Theorem 1.19 in [13] in order to establish the Möbius and Nilsequences
conjecture MN(s) from [12] for arbitrary step $s$. For this application, it is important that
all bounds here are only polynomial in $M$, and that the equidistribution is established
on progressions of length linear in $N$ (as opposed to $N^c$ for some small $c > 0$).

Just as Corollary 1.12 implies a Ratner-type theorem, namely Corollary 1.13, it is
not hard to deduce the following result from Theorem 1.19.

Corollary 1.20 (Ratner-type theorem for polynomial nilsequences). Let $m, d \geq 0$,
$0 < \delta < 1/2$, and $N \geq 1$. Suppose that $G/\Gamma$ is an $m$-dimensional nilmanifold, that $G_\bullet$
is a filtration of degree $d$ on $G$, and that $\mathcal{X}$ is a $\frac{1}{\delta}$-rational Mal'cev basis adapted to
$G_\bullet$. Suppose that $g \in \mathrm{poly}(\mathbb{Z}, G_\bullet)$. Then we may decompose $[N]$ as a union $P_1 \cup \cdots \cup P_j$
of arithmetic progressions with length $\gg \delta^{O_{m,d}(1)} N$ and the same common difference $q$,
$1 \leq q \ll \delta^{-O_{m,d}(1)}$, such that each orbit $(g(n)\Gamma)_{n \in P_i}$ lies within $\delta$ (using the metric $d_{\mathcal{X}}$)
of $x_i G' y_i \Gamma/\Gamma \subseteq G/\Gamma$, where $x_i \in G$, $y_i \in G$ is $\delta^{-O_{m,d}(1)}$-rational, and $G'$ is a closed
subgroup of $G$ which is $\delta^{-O_{m,d}(1)}$-rational relative to $\mathcal{X}$ (this notion will be defined in the
next section).

Remark. The reader may wish to compare this with [B], another recent result on 
quantitative variants of Ratner's theorem. 

Let us conclude this introduction by remarking that our main theorem actually
applies to multiparameter polynomial mappings $g : \mathbb{Z}^t \to G$. In the infinitary setting such
a generalization was obtained by Leibman [23], and his result has subsequently been



THE QUANTITATIVE BEHAVIOUR OF POLYNOMIAL ORBITS ON NILMANIFOLDS 11 

applied in such papers as [2] and [24]. We have taken the trouble to derive
multiparameter extensions of our main results with analogous finitary applications in mind; see
Theorem 8.6 and Theorem 10.2.

2. Precise statements of results 

In this section we define various "quantitative" concepts (such as $Q$-rational Mal'cev
bases, subgroups which are $Q$-rational relative to such a basis and the metrics $d_{\mathcal{X}}$ and
$d_{G/\Gamma}$) which were needed to properly state the main results from the introduction section.
We also give a more precise version of Theorem 1.16, which we will then spend the next
several sections proving.

Mal'cev bases and metrics on $G/\Gamma$. Mal'cev coordinates play
a vital role in the quantitative theory of nilmanifolds. They allow us to put a metric
on $G/\Gamma$, which in turn allows us to define the notion of equidistribution; they also
quantify the "rationality" of various objects associated to the nilmanifold. Mal'cev
coordinates were introduced in [25], which contains a nice discussion; they are covered
quite extensively in the book [4], particularly Chapters 1 and 5. We will also need
several more quantitative statements about Mal'cev coordinates, which we have placed
in Appendix A. We recommend that the reader dip into that appendix as and when
required.

We will make use of the Lie algebra $\mathfrak{g}$ of $G$ together with the exponential map
$\exp : \mathfrak{g} \to G$. When $G$ is a connected, simply-connected nilpotent Lie group the exponential
map is a diffeomorphism; see [4, Theorem 1.2.1]. In particular, we have a logarithm
map $\log : G \to \mathfrak{g}$. One does not really need to have an understanding of the exponential
and logarithm maps beyond some of their formal properties, which we will list as we
need them, in order to understand this paper.

Definition 2.1 (Mal'cev bases). Let $G/\Gamma$ be an $m$-dimensional nilmanifold and let $G_\bullet$
be a filtration. A basis $\mathcal{X} = \{X_1, \ldots, X_m\}$ for the Lie algebra $\mathfrak{g}$ over $\mathbb{R}$ is called a
Mal'cev basis for $G/\Gamma$ adapted to $G_\bullet$ if the following four conditions are satisfied:

(i) For each $j = 0, \ldots, m - 1$ the subspace $\mathfrak{h}_j := \mathrm{Span}(X_{j+1}, \ldots, X_m)$ is a Lie
algebra ideal in $\mathfrak{g}$, and hence $H_j := \exp \mathfrak{h}_j$ is a normal Lie subgroup of $G$.

(ii) For every $0 \leq i \leq s$ we have $G_i = H_{m - m_i}$;

(iii) Each $g \in G$ can be written uniquely as $\exp(t_1 X_1) \exp(t_2 X_2) \cdots \exp(t_m X_m)$, for
$t_i \in \mathbb{R}$;

(iv) $\Gamma$ consists precisely of those elements which, when written in the above form,
have all $t_i \in \mathbb{Z}$.

Remarks. Our main results only make sense if the nilmanifold $G/\Gamma$ is already equipped
with a Mal'cev basis $\mathcal{X}$, since they involve quantitative dependencies that can only
be described using such a basis. However it is a well-known result of Mal'cev [25] that any
nilmanifold $G/\Gamma$ can be equipped with a Mal'cev basis adapted to the lower central
series filtration. Indeed the very existence of a discrete and cocompact subgroup $\Gamma$
guarantees that the lower central series is rational by [4, Theorem 5.1.8 (a)] and
[4, Corollary 5.2.2]. One may then apply [4, Proposition 5.3.2] to deduce the existence of
a Mal'cev basis adapted to the lower central series. More generally there is a Mal'cev
basis adapted to any filtration $G_\bullet$ which consists of rational subgroups
(cf. Definition 2.5).

We refer to the $t_i$ as the Mal'cev coordinates of $g$, and we define the Mal'cev coordinate
map $\psi = \psi_{\mathcal{X}} : G \to \mathbb{R}^m$ to be the map

$$\psi(g) := (t_1, \ldots, t_m); \tag{2.1}$$

thus for instance $\Gamma = \psi^{-1}(\mathbb{Z}^m)$. If $\mathcal{X}'$ is another Mal'cev basis
(relative to some filtration) then we write $\psi' = \psi_{\mathcal{X}'}$. Only very occasionally
will we need to use the notation $\psi_{\mathcal{Y}}$ to indicate the coordinate map relative to
some further basis $\mathcal{Y}$.

Remarks. In the literature, Mal'cev coordinates are invariably discussed in the context
of the lower central series filtration and are referred to as coordinates of the second kind.
Coordinates of the first kind, or exponential coordinates, are derived by writing
$\log g \in \mathfrak{g}$ as a linear combination $\log g = s_1 X_1 + \cdots + s_m X_m$ of elements
of the basis $\mathcal{X}$, and we write $\psi_{\exp}(g) = \psi_{\mathcal{X},\exp}(g) := (s_1, \ldots, s_m)$
for the coordinates of $g$ obtained in this fashion. However, we shall mostly work using
coordinates of the second kind.

We can use a Mal'cev basis $\mathcal{X}$ to put a (slightly artificial) metric structure on $G$
and on $G/\Gamma$.

Definition 2.2 (Metrics on $G$ and $G/\Gamma$). Let $G/\Gamma$ be a nilmanifold with Mal'cev
basis $\mathcal{X}$. We define $d = d_{\mathcal{X}} : G \times G \to \mathbb{R}_{\ge 0}$ to be the
largest metric such that $d(x,y) \le |\psi(xy^{-1})|$ for all $x, y \in G$, where $|\cdot|$ denotes
the $\ell^\infty$-norm on $\mathbb{R}^m$. More explicitly, we have

$$d(x,y) = \inf\Big\{ \sum_{i=1}^{n} \min\big(|\psi(x_{i-1} x_i^{-1})|, |\psi(x_i x_{i-1}^{-1})|\big) : x_0, \ldots, x_n \in G;\ x_0 = x;\ x_n = y \Big\}.$$

This descends to a metric on $G/\Gamma$ by setting

$$d(x\Gamma, y\Gamma) := \inf\{ d(x', y') : x', y' \in G;\ x' \equiv x \ (\mathrm{mod}\ \Gamma);\ y' \equiv y \ (\mathrm{mod}\ \Gamma) \}.$$

It turns out that this is indeed⁴ a metric on $G/\Gamma$; this essentially follows from the
discreteness of $\Gamma$ in $G$, and we will prove it in Lemma A.15. Since $d$ is right-invariant,
we also have

$$d(x\Gamma, y\Gamma) = \inf_{\gamma \in \Gamma} d(x, y\gamma).$$

When the letter $d$ is used for a metric, it will always denote the metric $d_{\mathcal{X}}$
relative to some basis $\mathcal{X}$ that is already under discussion. The symbol $d'$ will be
used for the metric defined using some other basis $\mathcal{X}'$. On the very rare occasions
(for example in the proof of Lemma 7.4) where the metric relative to some further basis is
under consideration we will indicate this explicitly using subscripts.

Quantitative rationality. Now we define the concept of rational nilmanifolds 
and subgroups. 



4 We note that this metric structure is a little more specific than in some of our previous
papers, notably that in [12, §8]. This will not cause any difficulty, as the metrics in that
paper are equivalent to the one given here, up to constants depending on $G$, $\Gamma$ and
$\mathcal{X}$. Indeed, at small scales $d$ agrees with the distance function given by the unique
right-invariant Riemannian metric on $G$ whose value at the origin is equal to that of the
Euclidean metric at the origin of $\mathbb{R}^m$, pulled back by $\psi$; see also Lemma

El 
Definition 2.3 (Height). The height of a real number $x$ is defined as $\max(|a|, |b|)$ if
$x = a/b$ is rational in reduced form, and $\infty$ if $x$ is irrational.
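
As a small illustration of Definition 2.3, the following Python sketch computes the height of a rational number. The function name `height` is ours (an assumption for illustration), and the sketch only handles exactly represented rationals; an irrational number has height $\infty$ by definition and cannot be passed in.

```python
from fractions import Fraction


def height(x):
    """Height of a rational x = a/b in reduced form: max(|a|, |b|).

    Only ints and Fractions are supported in this sketch; Fraction keeps the
    value in lowest terms with a positive denominator.
    """
    frac = Fraction(x)
    return max(abs(frac.numerator), frac.denominator)
```

For instance, `height(Fraction(6, 4))` reduces $6/4$ to $3/2$ before taking the maximum.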

Definition 2.4 (Rationality of a basis). Let $G/\Gamma$ be a nilmanifold and $Q > 0$. We say
that a Mal'cev basis $\mathcal{X}$ for $G/\Gamma$ is $Q$-rational if all of the structure
constants $c_{ijk}$ in the relations

$$[X_i, X_j] = \sum_{k} c_{ijk} X_k$$

are rational with height at most $Q$.

Definition 2.5 (Rational subgroups). Suppose that a nilmanifold $G/\Gamma$ is given together
with a Mal'cev basis $\mathcal{X} = \{X_1, \ldots, X_m\}$, and that $Q > 0$. Suppose that
$G' \subseteq G$ is a closed connected subgroup. We say that $G'$ is $Q$-rational relative to
$\mathcal{X}$ if the Lie algebra $\mathfrak{g}'$ has a basis $\mathcal{X}' = \{X'_1, \ldots, X'_{m'}\}$
consisting of linear combinations $\sum_{i=1}^{m} a_i X_i$, where the $a_i$ are rational numbers
with height at most $Q$ for all $i$.

Definition 2.6 (Modulus of a horizontal character). Suppose that $G/\Gamma$ is a nilmanifold
with a Mal'cev basis $\mathcal{X}$. Suppose that $\eta : G \to \mathbb{R}/\mathbb{Z}$ is a horizontal
character, that is to say a homomorphism from $G$ to $\mathbb{R}/\mathbb{Z}$ which annihilates
$\Gamma$. Then, when written in coordinates relative to $\mathcal{X}$, properties (iii) and (iv)
of Definition 2.1 imply that $\eta(g) = k \cdot \psi(g)$ for some unique $k \in \mathbb{Z}^m$.
We write $|\eta| := |k|$.

Smooth polynomial sequences. For technical reasons it will be convenient to
quantify the smoothness of sequences, such as the sequence $\varepsilon(n)$ appearing in
Theorem 1.19, in a slightly different manner from that used so far.

Definition 2.7 (Smoothness norms). Suppose that $g : \mathbb{Z} \to \mathbb{R}/\mathbb{Z}$ is a
polynomial sequence of degree $d$. Then $g$ may be written uniquely as

$$g(n) = \sum_{j=0}^{d} \alpha_j \binom{n}{j},$$

where $\alpha_j$ is in fact equal to $\partial^j g(0)$. For any $N > 0$ we define the smoothness norm

$$\|g\|_{C^\infty[N]} := \sup_{1 \le j \le d} N^j \|\alpha_j\|_{\mathbb{R}/\mathbb{Z}}.$$

The smoothness norm $\|\cdot\|_{C^\infty[N]}$ is designed to capture the notion of a polynomial
sequence which is slowly-varying. Indeed, the following lemma is easily verified:

Lemma 2.8 (Smooth polynomials vary slowly). Let $g : \mathbb{Z} \to \mathbb{R}/\mathbb{Z}$ be a
polynomial sequence of degree $d$, and let $N > 0$. Then for any $n \in [N]$ we have

$$\|g(n) - g(n-1)\|_{\mathbb{R}/\mathbb{Z}} \ll_d \frac{1}{N} \|g\|_{C^\infty[N]}.$$

In view of this lemma, we see that Theorem 1.16 will be an immediate consequence
of the following more precise statement. This is in fact the main technical result in our
paper and we will use it to derive all our other main results.

Theorem 2.9 (Quantitative Leibman theorem). Let $m, d \ge 0$, $0 < \delta < 1/2$ and
$N \ge 1$. Suppose that $G/\Gamma$ is an $m$-dimensional nilmanifold together with a filtration
$G_\bullet$, and that $\mathcal{X}$ is a $\frac{1}{\delta}$-rational Mal'cev basis adapted to
$G_\bullet$. Suppose that $g \in \mathrm{poly}(\mathbb{Z}, G_\bullet)$. If $(g(n)\Gamma)_{n \in [N]}$
is not $\delta$-equidistributed, then there is a horizontal character $\eta$ with
$0 < |\eta| \ll \delta^{-O_{m,d}(1)}$ such that

$$\|\eta \circ g\|_{C^\infty[N]} \ll \delta^{-O_{m,d}(1)}.$$
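
The smoothness norm of Definition 2.7 is easy to compute in examples. The following Python sketch, given under the assumption that the polynomial is supplied through exact rational values $g(0), \ldots, g(d)$ (lifted to $\mathbb{R}$), recovers the coefficients $\alpha_j$ by repeated differencing and evaluates the norm. All function names are ours and purely illustrative.

```python
from fractions import Fraction
from math import comb


def binomial_coefficients(values):
    """Given g(0), ..., g(d), return alpha_0, ..., alpha_d with
    g(n) = sum_j alpha_j * C(n, j); alpha_j is the j-th forward difference at 0."""
    row = list(values)
    alphas = []
    while row:
        alphas.append(row[0])
        row = [row[i + 1] - row[i] for i in range(len(row) - 1)]
    return alphas


def evaluate(alphas, n):
    """Evaluate sum_j alpha_j * C(n, j) for an integer n >= 0."""
    return sum(a * comb(n, j) for j, a in enumerate(alphas))


def dist_to_nearest_integer(x):
    """The norm ||x||_{R/Z} of a rational x."""
    frac = x - (x // 1)  # fractional part in [0, 1)
    return min(frac, 1 - frac)


def smoothness_norm(alphas, N):
    """sup over 1 <= j <= d of N^j * ||alpha_j||_{R/Z} (the constant term is ignored)."""
    return max(
        (N ** j * dist_to_nearest_integer(a) for j, a in enumerate(alphas) if j >= 1),
        default=0,
    )
```

With these helpers one can check Lemma 2.8 numerically on small examples: the step $\|g(n) - g(n-1)\|_{\mathbb{R}/\mathbb{Z}}$ stays below a constant multiple of $\|g\|_{C^\infty[N]}/N$ for $n \in [N]$.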

Notes on reading the paper. As with so many papers, some parts of this work
are merely technical and other parts represent deeper ideas of greater interest. There
are quite a number of computations in this paper in which one has to show, say, that a
certain integer is bounded polynomially by another, or that a certain basis is
$O(\delta^{-O(1)})$-rational. All such computations are of the technical variety and should
certainly be ignored on a first reading. They are all in a sense "clear"; their proofs proceed by algebra
of a type which could hardly be expected to introduce non-polynomial dependencies. 
It is possible that this could even be encoded in some relatively soft "proof-theoretic" 
language, but we have chosen not to follow such a path. 

We begin with several sections containing motivating examples. In §3 we will discuss
linear flows on tori $\mathbb{R}^m/\mathbb{Z}^m$, in §4 we shall discuss polynomial flows on
$\mathbb{R}/\mathbb{Z}$, and in §5 we will look at linear flows on the 2-step Heisenberg
nilmanifold (1.1). Some lemmas from these sections will be required in the sequel.

We then begin the study of the general case. In §6 we study the algebraic properties
of polynomial sequences on nilpotent groups following Lazard and Leibman. There 
is a rich general theory here which is not evident from the study of the abelian and 
Heisenberg examples. 

We then turn to the full proof of Theorem 2.9, the quantitative Leibman theorem.
This is the technical heart of the paper and is given in the (rather long) §7.

In §8 we use a straightforward iteration argument to bootstrap Theorem 2.9 to a
multiparameter version of itself, namely Theorem 8.6. In §9 we then establish a preliminary
multiparameter factorization theorem, Proposition 9.2, which is a fairly short consequence
of Theorem 8.6. In §10 we then iterate this proposition, obtaining a multiparameter
theorem (Theorem 10.2) which then easily implies Theorem 1.19 (and hence
Corollary 1.20) as special cases.

The appendix contains basic results on bases and nilmanifolds. 

There is unfortunately a large amount of notation in this paper. In Figure 1 the key
objects in the argument are briefly described.

3. A quantitative Kronecker theorem 

In this section we prove Theorem 2.9 for linear sequences on the torus
$\mathbb{R}^m/\mathbb{Z}^m$, that is to say we establish a quantitative Kronecker theorem.
The methods and the result are very standard.

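Proposition 3.1 (Quantitative Kronecker Theorem). Let $m \ge 1$, let $0 < \delta < 1/2$,
and let $\alpha \in \mathbb{R}^m$. If the sequence $(\alpha n \ (\mathrm{mod}\ \mathbb{Z}^m))_{n \in [N]}$
is not $\delta$-equidistributed in the additive torus $\mathbb{R}^m/\mathbb{Z}^m$, then there
exists $k \in \mathbb{Z}^m$ with $0 < |k| \ll \delta^{-O_m(1)}$ such that
$\|k \cdot \alpha\|_{\mathbb{R}/\mathbb{Z}} \ll \delta^{-O_m(1)}/N$.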

Remark. We leave it to the reader to check that this really is the specialization
of Theorem 2.9 to the case of linear orbits on the torus $\mathbb{R}^m/\mathbb{Z}^m$. This may
be found helpful in understanding some of our notation. Note in particular that in this
case the horizontal torus is simply $\mathbb{R}^m/\mathbb{Z}^m$, and we may take $\pi$ to be the
identity map.

$G$                                                      nilpotent group                                      Definition 1.1
$G_\bullet = (G_i)_{i=0}^{\infty}$                       filtration on $G$                                    Definition 1.1
$G/\Gamma$                                               nilmanifold                                          Definition 1.1
$(G/\Gamma)_{\mathrm{ab}} = G/[G,G]\Gamma$               horizontal torus                                     Definition 1.5
$G_d/(\Gamma \cap G_d) \cong \mathbb{R}^{m_d}/\mathbb{Z}^{m_d}$   vertical torus                              Definition 3.3
$d \ge 0$                                                degree of the filtration $G_\bullet$                 Definition 1.1
$s \ge 0$                                                step of $G$                                          Definition 1.1
$m \ge 0$                                                dimension of $G$                                     Definition 1.1
$m_i$                                                    dimension of $G_i$                                   Definition 1.1
                                                         dimension of horizontal torus                        Definition 1.5
$m - m_2$
                                                         nonlinearity degree of $G_\bullet$
$\eta : G \to \mathbb{R}/\mathbb{Z}$                     horizontal character                                 Definition 1.5
$\xi : G_d \to \mathbb{R}/\mathbb{Z}$                    vertical character                                   Definition 3.4
$\mathcal{X} = (X_i)_{i=1}^{m}$, $\mathcal{X}' = (X'_i)_{i=1}^{m'}$   Mal'cev bases                           Definition 2.1
$\psi$, $\psi'$                                          coordinate maps relative to $\mathcal{X}, \mathcal{X}'$
$d$, $d'$                                                metrics defined using $\mathcal{X}, \mathcal{X}'$    Definition 2.2
$Q \ge 1$                                                rationality bound for $\mathcal{X}$ (usually $Q = 1/\delta$)   Definition 2.4
$\pi : G \to (G/\Gamma)_{\mathrm{ab}}$                   projection onto the horizontal torus                 Definition 1.5
$F : G/\Gamma \to \mathbb{C}$                            Lipschitz function                                   Definition 2.2
$0 < \delta < 1/2$                                       level of equidistribution                            Definition 1.2
$N \ge 1$                                                length of sequence                                   Definition 1.2
$g : \mathbb{Z} \to G$                                   a polynomial sequence                                Definition 1.8
$\mathrm{poly}(\mathbb{Z}, G_\bullet)$                   polynomial sequences with coeffs in $G_\bullet$      Definition 1.8
$t \ge 1$                                                number of parameters

Figure 1. A list of key objects in the paper, together with brief descriptions of these
objects, and the location where they are first defined or introduced.

Proof. By Definition 1.2 there is a Lipschitz function $F : \mathbb{R}^m/\mathbb{Z}^m \to \mathbb{R}$
such that

$$\Big| \mathbb{E}_{n \in [N]} F(\alpha n \ (\mathrm{mod}\ \mathbb{Z}^m)) - \int_{\mathbb{R}^m/\mathbb{Z}^m} F \, d\theta \Big| > \delta \|F\|_{\mathrm{Lip}}. \tag{3.1}$$

At the expense of replacing $\delta$ by $\delta/2$ we may translate $F$, add a constant to it and
rescale in such a way that $\int F = 0$ and $\|F\|_{\mathrm{Lip}} = 1$. By approximating $F$ by
smooth functions we may assume that $F$ is smooth (we do this to avoid any technical issues
regarding convergence of Fourier series). We now use a standard manoeuvre to approximate
$F$ by a function which has finite support in frequency space (cf. [11, Lemma A.9]).

Consider the Fejér kernel $K : \mathbb{R}^m/\mathbb{Z}^m \to \mathbb{R}^+$ defined by

$$K := \frac{1}{\mathrm{mes}(Q)} 1_Q * \frac{1}{\mathrm{mes}(Q)} 1_Q,$$

where $Q := [-\frac{\delta}{16}, \frac{\delta}{16}]^m \subseteq \mathbb{R}^m/\mathbb{Z}^m$ is a small
cube, and $*$ denotes the usual convolution operation on the torus $\mathbb{R}^m/\mathbb{Z}^m$.
It is immediate that $K$ is a non-negative function supported in $Q$ with

$$\int_{\mathbb{R}^m/\mathbb{Z}^m} K = 1. \tag{3.2}$$

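A simple calculation also establishes the estimate

$$\sum_{k \in \mathbb{Z}^m : |k| > M} |\hat{K}(k)| \ll_m \delta^{-2m} M^{-1} \tag{3.3}$$

for all $M \ge 1$, where the Fourier coefficient is defined by

$$\hat{K}(k) := \int_{\mathbb{R}^m/\mathbb{Z}^m} K(\theta) e(-\theta \cdot k) \, d\theta$$

and $e(x) := e^{2\pi i x}$ is the standard character on $\mathbb{R}/\mathbb{Z}$. We also have the
crude bound

$$|\hat{F}(k)| \le \|F\|_\infty \le \|F\|_{\mathrm{Lip}} \le 1 \tag{3.4}$$

for all $k \in \mathbb{Z}^m$.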

Set $F_1 := F * K$. Since $\|F\|_{\mathrm{Lip}} = 1$, and $K$ is supported in $Q$ and satisfies
(3.2), a standard computation shows that

$$\|F - F_1\|_\infty \le \delta/8.$$

Choose $M := C_m \delta^{-2m-1}$ for some suitably large $C_m$, and set

$$F_2(\theta) := \sum_{k \in \mathbb{Z}^m : 0 < |k| \le M} \hat{F}_1(k) e(k \cdot \theta).$$

Noting that $\hat{F}_1(0) = 0$, facts (3.3), (3.4) and the Fourier inversion formula imply that

$$\|F_1 - F_2\|_\infty \le \delta/8.$$

It follows that $\|F - F_2\|_\infty \le \delta/4$, which means in view of (3.1) that

$$|\mathbb{E}_{n \in [N]} F_2(n\alpha \ (\mathrm{mod}\ \mathbb{Z}^m))| > \delta/4.$$

Applying (3.4) once more we see that there is some $k$, $0 < |k| \le M$, such that

$$|\mathbb{E}_{n \in [N]} e(nk \cdot \alpha)| \gg_m \delta M^{-m} \gg_m \delta^{O_m(1)}.$$

The result now follows immediately from the standard estimate

$$|\mathbb{E}_{n \in [N]} e(nt)| \ll \min\Big(1, \frac{1}{N\|t\|_{\mathbb{R}/\mathbb{Z}}}\Big),$$

which follows from summing the geometric progression. □
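
The geometric-sum estimate used at the end of the proof can be checked numerically. The Python sketch below uses the explicit constant $1/2$, which comes from $|\sin(\pi x)| \ge 2\|x\|_{\mathbb{R}/\mathbb{Z}}$; this constant and the function names are our own choices for illustration, not part of the statement above.

```python
import cmath
import math


def e(x):
    """The standard character e(x) = exp(2*pi*i*x)."""
    return cmath.exp(2j * math.pi * x)


def average_phase(t, N):
    """E_{n in [N]} e(n t), with [N] = {1, ..., N}."""
    return sum(e(n * t) for n in range(1, N + 1)) / N


def dist_to_nearest_integer(t):
    """The norm ||t||_{R/Z}."""
    return abs(t - round(t))


def geometric_bound(t, N):
    """min(1, 1 / (2 N ||t||)), an explicit form of the standard estimate."""
    d = dist_to_nearest_integer(t)
    return 1.0 if d == 0 else min(1.0, 1.0 / (2 * N * d))
```

Comparing `abs(average_phase(t, N))` with `geometric_bound(t, N)` for a few values of `t` shows the average decaying like $1/N$ as soon as $t$ is not too close to an integer.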

Let us now record a corollary of the $m = 1$ version of this result which will be
used several times in the sequel. This gives stronger information in the case that
$(n\alpha \ (\mathrm{mod}\ \mathbb{Z}))_{n \in [N]}$ is very far from being equidistributed.

Lemma 3.2 (Strongly recurrent linear functions are highly non-diophantine). Let
$\alpha \in \mathbb{R}$, $0 < \delta < 1/2$, and $0 < \varepsilon \le \delta/2$, and let
$I \subseteq \mathbb{R}/\mathbb{Z}$ be an interval of length $\varepsilon$ such that
$\alpha n \in I$ for at least $\delta N$ values of $n \in [N]$. Then there is some
$k \in \mathbb{Z}$ with $0 < |k| \ll \delta^{-O(1)}$ such that
$\|\alpha k\|_{\mathbb{R}/\mathbb{Z}} \ll \varepsilon \delta^{-O(1)}/N$.
Proof. Taking $F$ to be a Lipschitz approximation to the interval $I$, we see immediately
that our assumption precludes $(\alpha n \ (\mathrm{mod}\ \mathbb{Z}))_{n \in [N]}$ from being
$\delta^{10}$-equidistributed. It follows from the case $m = 1$ of Proposition 3.1 that there is
some $q \in \mathbb{Z}$, $0 < q \ll \delta^{-C}$, such that
$\|q\alpha\|_{\mathbb{R}/\mathbb{Z}} \ll \delta^{-C}/N$, where $C = O(1)$. Write
$\beta := \|q\alpha\|_{\mathbb{R}/\mathbb{Z}}$. Let $n_0 \in \mathbb{Z}$ be arbitrary, and suppose
that $n'$ ranges over any interval of integers $J$ of length at most $1/\beta$. The number of
$n'$ for which $\alpha(n_0 + qn') \ (\mathrm{mod}\ \mathbb{Z}) \in I$ is then at most
$1 + \varepsilon/\beta$. Since $[N]$ may be divided into at most $2q + \beta N$ progressions of
the form $\{n_0 + qn' : n' \in J\}$ we obtain from our assumption the inequality

$$\delta N \le \#\{n \in [N] : \alpha n \ (\mathrm{mod}\ \mathbb{Z}) \in I\} \le \Big(1 + \frac{\varepsilon}{\beta}\Big)(2q + \beta N) \ll q + \frac{q\varepsilon}{\beta} + \beta N + \varepsilon N. \tag{3.5}$$

Now the lemma is trivial if $N \ll \delta^{-10C}$ and follows immediately from Proposition 3.1
when $\varepsilon \gg \delta^{10C}$, so suppose that neither of these is the case. Then all of the
terms except the second on the right-hand side of (3.5) are negligible, and we deduce that

$$\delta N \ll q\varepsilon/\beta.$$

This immediately implies the result (with $k := q$). □

The main idea in the proof of Proposition 3.1, of course, was that the space of
Lipschitz functions is essentially spanned by the space of pure phase functions
$e(k \cdot \theta)$. Thus we were able to assert that if equidistribution fails for some $F$ in
the sense of (3.1), then it also fails (albeit with a smaller value of $\delta$) for a pure phase
function with not-too-large frequency.

A similar observation turns out to be essential in the analysis of polynomial sequences
on general nilmanifolds $G/\Gamma$ (cf. the proof of [22, Theorem 2.17]). Though we will not
be discussing general sequences for quite a while, this does seem to be an appropriate
place to state and prove a lemma which generalizes the observations just made. For
this, we will be working primarily on the vertical torus:

Definition 3.3 (Vertical torus). Suppose that $G/\Gamma$ is a nilmanifold and that $G_\bullet$
is a filtration of degree $d$. Note that $G_d$ then lies in the centre of $G$. We define the
vertical torus to be $G_d/(\Gamma \cap G_d)$, and the vertical dimension $m_d$ to be
$m_d := \dim G_d$; the last $m_d$ coordinates of the Mal'cev coordinate map $\psi$ may be used
to canonically identify $G_d$ and $G_d/(\Gamma \cap G_d)$ with $\mathbb{R}^{m_d}$ and
$\mathbb{R}^{m_d}/\mathbb{Z}^{m_d}$ respectively. Also observe that the vertical torus acts
canonically on the nilmanifold $G/\Gamma$, thus we can define⁵ $\theta y \in G/\Gamma$ for all
$\theta \in \mathbb{R}^{m_d}/\mathbb{Z}^{m_d}$ and $y \in G/\Gamma$.

Definition 3.4 (Vertical characters). A vertical character is a continuous homomorphism
$\xi : G_d \to \mathbb{R}/\mathbb{Z}$ such that $\Gamma \cap G_d \subseteq \ker \xi$ (in
particular, $\xi$ can also be meaningfully defined on
$G_d/(\Gamma \cap G_d) \cong \mathbb{R}^{m_d}/\mathbb{Z}^{m_d}$). Any such character has the form
$\xi(x) = k \cdot x$ for a unique $k \in \mathbb{Z}^{m_d}$, where we identify $G_d$ with
$\mathbb{R}^{m_d}$. We refer to $k$ as the frequency of the character $\xi$, and $|\xi| := |k|$
as the frequency magnitude. For instance the trivial character $\xi = 0$ has frequency 0.

Definition 3.5 (Vertical oscillation). Let $F : G/\Gamma \to \mathbb{C}$ be a Lipschitz function
and suppose that $\xi$ is a vertical character. We say that $F$ has vertical oscillation $\xi$
if we have $F(g_d \cdot x) = e(\xi(g_d)) F(x)$ for all $g_d \in G_d$ and $x \in G/\Gamma$.



5 Here we have a slight clash between the additive notation for the torus
$\mathbb{R}^{m_d}/\mathbb{Z}^{m_d}$ and the multiplicative notation for the group $G$. We hope
this will not confuse the reader.
The next definition is a repetition of Definition 1.2, except that we specialize to
functions with a fixed vertical oscillation $\xi$.

Definition 3.6 (Equidistribution along a vertical character). Let $g : \mathbb{Z} \to G$ be a
polynomial sequence. We say that $(g(n)\Gamma)_{n \in [N]}$ is $\delta$-equidistributed along a
vertical character $\xi$ if

$$\Big| \mathbb{E}_{n \in [N]} F(g(n)\Gamma) - \int_{G/\Gamma} F \Big| \le \delta \|F\|_{\mathrm{Lip}}$$

for all Lipschitz functions $F : G/\Gamma \to \mathbb{C}$ with vertical oscillation $\xi$.

The next lemma states that in order to check whether a sequence is equidistributed, 
it suffices to test that sequence against functions possessing a vertical oscillation. 

Lemma 3.7 (Vertical oscillation reduction). Let $G/\Gamma$ be a nilmanifold together with
a filtration $G_\bullet$ of degree $d$. Let $m_d$ be as above, and let $0 < \delta < 1/2$.
Suppose that $g : \mathbb{Z} \to G$ is a polynomial sequence and that $(g(n)\Gamma)_{n \in [N]}$
is not $\delta$-equidistributed. Then there is a vertical character $\xi$ with
$|\xi| \ll \delta^{-O_{m_d}(1)}$ such that $(g(n)\Gamma)_{n \in [N]}$ is not
$\delta^{O_{m_d}(1)}$-equidistributed along the vertical oscillation $\xi$.

Proof. We merely sketch this, for the argument is little more than a repetition of that
used to prove Proposition 3.1. We begin with the same reductions. That is, assuming
the existence of an $F : G/\Gamma \to \mathbb{C}$ such that

$$\Big| \mathbb{E}_{n \in [N]} F(g(n)\Gamma) - \int_{G/\Gamma} F \Big| > \delta \|F\|_{\mathrm{Lip}}, \tag{3.6}$$

we weaken $\delta$ to $\delta/2$ and assume that $\int_{G/\Gamma} F = 0$, that
$\|F\|_{\mathrm{Lip}} = 1$ and that $F$ is smooth.

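Let $K$ be the same Fejér-type kernel as before, and now take $F_1 : G/\Gamma \to \mathbb{C}$
to be the function obtained by convolving with $K$ in each
$G_d/(\Gamma \cap G_d) \cong \mathbb{R}^{m_d}/\mathbb{Z}^{m_d}$-fibre, that is to say

$$F_1(y) := \int_{\mathbb{R}^{m_d}/\mathbb{Z}^{m_d}} F(\theta y) K(\theta) \, d\theta.$$

Fourier expansion on $\mathbb{R}^{m_d}/\mathbb{Z}^{m_d}$ gives

$$F_1(y) = \sum_{k \in \mathbb{Z}^{m_d}} F^\wedge(y; k) \hat{K}(k),$$

where

$$F^\wedge(y; k) := \int_{\mathbb{R}^{m_d}/\mathbb{Z}^{m_d}} F(\theta y) e(-k \cdot \theta) \, d\theta.$$

Now for $g_d \in G_d \cong \mathbb{R}^{m_d}$ we have

$$F^\wedge(g_d y; k) = \int_{\mathbb{R}^{m_d}/\mathbb{Z}^{m_d}} F((\theta + g_d) y) e(-k \cdot \theta) \, d\theta = e(k \cdot g_d) F^\wedge(y; k),$$

thus each function $F^\wedge(y; k)$ has vertical oscillation $\xi$, where $\xi(x) := k \cdot x$ is
the vertical character with frequency $k$.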

Using exactly the same estimates as in the proof of Proposition 3.1, we have
$\|F - F_2\|_\infty \le \delta/4$, where

$$F_2(y) := \sum_{k \in \mathbb{Z}^{m_d} : |k| \le M} F^\wedge(y; k) \hat{K}(k)$$

for some $M = C_{m_d} \delta^{-2m_d - 1}$. The rest of the argument proceeds exactly as before,
and we see that if we take $\tilde{F}(y) := F^\wedge(y; k)$ for suitable $k \in \mathbb{Z}^{m_d}$,
$|k| \ll \delta^{-O_{m_d}(1)}$, we have

$$\Big| \mathbb{E}_{n \in [N]} \tilde{F}(g(n)\Gamma) - \int_{G/\Gamma} \tilde{F} \Big| > \delta^{O_{m_d}(1)} \|\tilde{F}\|_{\mathrm{Lip}}.$$

Thus $(g(n)\Gamma)_{n \in [N]}$ is not $\delta^{O_{m_d}(1)}$-equidistributed along the vertical
character $\xi$, as desired. □



4. The van der Corput trick and polynomial flows on tori 

In the last section we introduced one important trick: the idea of decomposing
a Lipschitz function into phases using Fourier analysis. In this section we introduce a
second trick, namely the use of van der Corput's inequality, and use this trick to study
polynomial sequences on tori $\mathbb{R}^m/\mathbb{Z}^m$. Although our language is somewhat
different, this is really just a reprise of the standard theory of Weyl sums as used for
instance in the study of Waring's problem (see, for example, [33]).

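Lemma 4.1 (van der Corput inequality). Let $N, H$ be positive integers and suppose
that $(a_n)_{n \in [N]}$ is a sequence of complex numbers. Extend $(a_n)$ to all of $\mathbb{Z}$ by
defining $a_n := 0$ when $n \notin [N]$. Then

$$|\mathbb{E}_{n \in [N]} a_n|^2 \le \frac{N + H}{HN} \sum_{|h| \le H} \Big(1 - \frac{|h|}{H}\Big) \mathbb{E}_{n \in [N]} a_n \overline{a_{n+h}}.$$

Proof. We have

$$\mathbb{E}_{n \in [N]} a_n = \frac{1}{HN} \sum_{-H < n \le N} \sum_{h=0}^{H-1} a_{n+h}.$$

Thus, applying the Cauchy-Schwarz inequality, we have

$$|\mathbb{E}_{n \in [N]} a_n|^2 = \frac{1}{H^2 N^2} \Big| \sum_{-H < n \le N} \sum_{h=0}^{H-1} a_{n+h} \Big|^2
\le \frac{N + H}{H^2 N^2} \sum_{-H < n \le N} \Big| \sum_{h=0}^{H-1} a_{n+h} \Big|^2
= \frac{N + H}{H^2 N^2} \sum_{-H < n \le N} \sum_{h=0}^{H-1} \sum_{h'=0}^{H-1} a_{n+h} \overline{a_{n+h'}},$$

which is equivalent to the right hand side of the claimed inequality. □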
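
Lemma 4.1 can be sanity-checked numerically. The following Python sketch evaluates both sides of the inequality for a finite complex sequence, with the zero extension outside $[N]$ made explicit; the function name `vdc_sides` is ours and the code is only an illustration, not part of the argument.

```python
def vdc_sides(a, H):
    """Return (lhs, rhs) of the van der Corput inequality for a = [a_1, ..., a_N]."""
    N = len(a)

    def term(n):
        # a_n with the convention a_n = 0 outside [N] = {1, ..., N}
        return a[n - 1] if 1 <= n <= N else 0

    lhs = abs(sum(a) / N) ** 2
    total = 0
    for h in range(-H, H + 1):
        # E_{n in [N]} a_n * conj(a_{n+h})
        corr = sum(term(n) * term(n + h).conjugate() for n in range(1, N + 1)) / N
        total += (1 - abs(h) / H) * corr
    # The weighted sum is real, since the terms for h and -h are complex conjugates.
    rhs = (N + H) / (H * N) * complex(total).real
    return lhs, rhs
```

Calling `vdc_sides` on a random sequence for several values of `H` gives a quick check that the left-hand side never exceeds the right-hand side.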

We will use the following simple (and rather crude) corollary of this, which we phrase 
in the contrapositive. 

Corollary 4.2 (van der Corput). Let $N$ be a positive integer and suppose that
$(a_n)_{n \in [N]}$ is a sequence of complex numbers with $|a_n| \le 1$. Extend $(a_n)$ to all of
$\mathbb{Z}$ by defining $a_n := 0$ when $n \notin [N]$. Suppose that $0 < \delta < 1$ and that

$$|\mathbb{E}_{n \in [N]} a_n| \ge \delta.$$

Then for at least $\delta^2 N/8$ values of $h \in [N]$ we have

$$|\mathbb{E}_{n \in [N]} a_{n+h} \overline{a_n}| \ge \delta^2/8.$$

Proof. The result is vacuous if N ^ A/5 2 , so assume this is not the case. Suppose for 
a contradiction that the result is false. Apply Lemma [4.11 with H = N. Then it is easy 
to see that we have 




where we have used the trivial estimate \E ne \j^a n a n+ h\ ^ 1 for those h £ [N] such that 
\K n& [is[]a n a n+ h\ ^ 5 2 /8, of which there are no more than 5 2 N/8. Rearranging and using 
the fact that N > A/5 2 we see that this is a contradiction. □ 

The next proposition is the main result of this section, and is Theorem 2.9 in the case
$G = \mathbb{R}$, $\Gamma = \mathbb{Z}$ and with $g : \mathbb{Z} \to G$ an arbitrary polynomial.

Proposition 4.3 (Weyl). Suppose that $g : \mathbb{Z} \to \mathbb{R}$ is a polynomial of degree
$d$, and let $0 < \delta < 1/2$. Then either $(g(n) \ (\mathrm{mod}\ \mathbb{Z}))_{n \in [N]}$ is
$\delta$-equidistributed, or else there is an integer $k$, $1 \le k \ll \delta^{-O_d(1)}$, such
that $\|kg \ (\mathrm{mod}\ \mathbb{Z})\|_{C^\infty[N]} \ll \delta^{-O_d(1)}$.

We will deduce this from the following, which is nothing but a reformulation of Weyl's
exponential sum estimate (see e.g. [33]).

Lemma 4.4 (Weyl's exponential sum estimate). Suppose that $g : \mathbb{Z} \to \mathbb{R}$ is a
polynomial of degree $d$ with leading coefficient $\alpha_d$ and that

$$|\mathbb{E}_{n \in [N]} e(g(n))| \ge \delta$$

for some $0 < \delta < 1/2$. Then there is $k \in \mathbb{Z}$, $0 < |k| \ll \delta^{-O_d(1)}$, such that

$$\|k\alpha_d\|_{\mathbb{R}/\mathbb{Z}} \ll \delta^{-O_d(1)}/N^d.$$

Proof. We proceed by induction on $d$, the result having been established in §3 in the
case $d = 1$. We may assume that $N > \delta^{-C_d}$ for some large $C_d$ since the result is
trivial otherwise. Applying van der Corput's estimate in the form of Corollary 4.2, we
deduce that there are $\gg \delta^2 N$ values of $h \in [N]$ such that

$$|\mathbb{E}_{n \in [N]} e(g(n + h) - g(n))| \gg \delta^2.$$

For each such $h$, $g(n + h) - g(n)$ is a polynomial with degree $d - 1$ and leading coefficient
$hd\alpha_d$. Thus by the induction hypothesis there is, for $\gg \delta^2 N$ values of
$h \in [N]$, some $1 \le q_h \ll \delta^{-O_d(1)}$ such that we have

$$\|h q_h d \alpha_d\|_{\mathbb{R}/\mathbb{Z}} \ll \delta^{-O_d(1)}/N^{d-1}$$

for each of these values of $h$. Pigeonholing in the $q_h$, this implies that there is $q$,
$1 \le q \ll \delta^{-O_d(1)}$, such that

$$\|q d \alpha_d h\|_{\mathbb{R}/\mathbb{Z}} \ll \delta^{-O_d(1)}/N^{d-1}$$

for $\gg \delta^{O_d(1)} N$ values of $h \in [N]$. Since $N$ is so large, Lemma 3.2 may be applied
to conclude that there is $q' \ll \delta^{-O_d(1)}$ such that

$$\|qq'\alpha_d\|_{\mathbb{R}/\mathbb{Z}} \ll \delta^{-O_d(1)}/N^d.$$

Redefining $q := qq'$, the result follows (with $k := q$). □
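
The differencing step in the proof above rests on the fact that $g(n+h) - g(n)$ has degree $d - 1$ and leading coefficient $hd\alpha_d$. The following Python sketch verifies this with exact rational arithmetic; it assumes, for illustration only, that $g$ is given by its monomial coefficients (lowest degree first), and the function name is ours.

```python
from fractions import Fraction
from math import comb


def shift_difference(coeffs, h):
    """Monomial coefficients (lowest degree first) of n -> g(n + h) - g(n),
    where g(n) = sum_j coeffs[j] * n**j."""
    d = len(coeffs) - 1
    out = [Fraction(0)] * (d + 1)
    for j, c in enumerate(coeffs):
        for i in range(j + 1):
            out[i] += c * comb(j, i) * h ** (j - i)  # binomial expansion of (n + h)^j
        out[j] -= c  # subtract g(n)
    while len(out) > 1 and out[-1] == 0:
        out.pop()  # drop vanishing top-degree terms
    return out
```

For a cubic with leading coefficient $\alpha_3$ and shift $h$, the last entry of the returned list is $3h\alpha_3$.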

Proof of Proposition 4.3. In this proof we allow all implied constants to depend on
$d$. Suppose that $g : \mathbb{Z} \to \mathbb{R}$ is a polynomial sequence of degree $d$ such that
the orbit $(g(n) \ (\mathrm{mod}\ \mathbb{Z}))_{n \in [N]}$ on $\mathbb{R}/\mathbb{Z}$ is not
$\delta$-equidistributed. Expand $g$ as a Taylor series

$$g(n) = \binom{n}{d} \alpha_d + \cdots + \binom{n}{1} \alpha_1 + \alpha_0 \tag{4.1}$$

and suppose as a hypothesis for induction on $r$, $0 \le r < d$, that we have shown that each
of the coefficients $\alpha_d, \alpha_{d-1}, \ldots, \alpha_{d-r}$ is nearly rational in the sense
that $\|q\alpha_{d-i}\|_{\mathbb{R}/\mathbb{Z}} \ll \delta^{-O(1)}/N^{d-i}$ for some
$q \ll \delta^{-O(1)}$ for $i = 0, \ldots, r$. (The implied constants in the $O()$ notation may
increase with each induction step, but there are only $d$ such steps, and we are allowing
these constants to depend on $d$, so this is harmless.) The statement we are trying to
prove, Proposition 4.3, is the case $r = d - 1$.

Now by the argument used in proving Proposition 3.1 (or indeed by simply quoting
Lemma 3.7), there is $k \in \mathbb{Z}$, $0 < |k| \ll \delta^{-O(1)}$, such that

$$|\mathbb{E}_{n \in [N]} e(kg(n))| \gg \delta^{O(1)}. \tag{4.2}$$

The base case $r = 0$ of the induction follows immediately from Lemma 4.4. Suppose
now that we have established the result for some $r$, and wish to establish it for $r + 1$.
Set

$$g'(n) := g(n) - \sum_{i=0}^{r} \alpha_{d-i} \binom{n}{d-i} = \alpha_{d-r-1} \binom{n}{d-r-1} + \cdots + \alpha_0.$$

Set $Q := q\,d!$, and write $\alpha_{d-i} = a_{d-i}/q + O(\delta^{-O(1)}/N^{d-i})$, $i = 0, \ldots, r$,
for some integers $a_{d-i}$. For any $n_0 \in \mathbb{Z}$ and any $n' \in \mathbb{Z}$ we have

$$g'(n_0 + Qn') - g'(n_0) = g(n_0 + Qn') - g(n_0) - \sum_{i=0}^{r} \alpha_{d-i} \Big( \binom{n_0 + Qn'}{d-i} - \binom{n_0}{d-i} \Big).$$

Set $N' := \lfloor \delta^{C_d} N \rfloor$ for some suitably large $C_d$ and suppose that
$n' \in [N']$ and also that $|n_0| \le 2N$. Then the contribution of the error terms
$O(\delta^{-O(1)}/N^{d-i})$ to the last sum is $O(\delta^{C_d - O(1)})$. The contribution of the
rational parts $a_{d-i}/q$ is an integer, since

$$\binom{n_0 + Qn'}{j} - \binom{n_0}{j} \equiv 0 \pmod{q}$$

for all $j \le d$. Thus we see that if $n' \in [N']$ and $|n_0| \le 2N$ then

$$g'(n_0 + Qn') - g'(n_0) = g(n_0 + Qn') - g(n_0) + O(\delta^{C_d - O(1)}) \ (\mathrm{mod}\ \mathbb{Z}). \tag{4.3}$$

Splitting $[N]$ into progressions of common difference $Q$ and length $N'$, plus a negligible
error, we see from (4.2) that there is $n_0$, $|n_0| \le 2N$, such that

$$|\mathbb{E}_{n' \in [N']} e(kg(n_0 + Qn'))| \gg \delta^{O(1)}.$$

It follows from (4.3) that

$$|\mathbb{E}_{n' \in [N']} e(kg'(n_0 + Qn'))| \gg \delta^{O(1)}.$$

By Lemma 4.4 we see that the leading coefficient
$\alpha' := kQ^{d-r-1}\alpha_{d-r-1}/(d-r-1)!$ of this polynomial is nearly rational in the
sense that there is $1 \le q' \ll \delta^{-O(1)}$ such that
$\|q'\alpha'\|_{\mathbb{R}/\mathbb{Z}} \ll \delta^{-O(1)}/N^{d-r-1}$. It follows that there is
$1 \le q'' \ll \delta^{-O(1)}$ such that
$\|q''\alpha_{d-r-1}\|_{\mathbb{R}/\mathbb{Z}} \ll \delta^{-O(1)}/N^{d-r-1}$. Setting $q := qq''$
we now clearly have $1 \le q \ll \delta^{-O(1)}$ and also
$\|q\alpha_{d-i}\|_{\mathbb{R}/\mathbb{Z}} \ll \delta^{-O(1)}/N^{d-i}$ for $i = 0, \ldots, r + 1$.

This concludes the proof of the inductive step and hence of the proposition. □
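
The congruence $\binom{n_0 + Qn'}{j} \equiv \binom{n_0}{j} \pmod{q}$ for $j \le d$ and $Q = q\,d!$, which was used in the inductive step above, can be spot-checked by machine. The Python sketch below is only an illustration: the function name is ours, and it is restricted to non-negative arguments because `math.comb` does not accept negative integers.

```python
from math import comb, factorial


def binomial_shift_divisible(q, d, n0, n_prime):
    """Check that C(n0 + Q*n', j) - C(n0, j) is divisible by q for all 0 <= j <= d,
    where Q = q * d!. Assumes n0 >= 0 and n_prime >= 0."""
    Q = q * factorial(d)
    return all(
        (comb(n0 + Q * n_prime, j) - comb(n0, j)) % q == 0 for j in range(d + 1)
    )
```

The reason behind the congruence is Vandermonde's identity: each $\binom{Qn'}{l}$ with $1 \le l \le d$ is a multiple of $q$, because $d!/l!$ is an integer.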

We will also need a "strong recurrence" result for polynomials $g : \mathbb{Z} \to \mathbb{R}$,
generalizing the linear result, Lemma 3.2, that we obtained in the last section. This is in
fact an easy deduction from Proposition 4.3 and Lemma 3.2.

Lemma 4.5 (Strongly recurrent polynomials are highly non-diophantine). Let $d \ge 0$,
and suppose that $g : \mathbb{Z} \to \mathbb{R}$ is a polynomial sequence of degree $d$. Suppose
that $0 < \delta < 1/2$ and $\varepsilon \le \delta/2$, that $I \subseteq \mathbb{R}/\mathbb{Z}$ is
an interval of length $\varepsilon$, and that $g(n) \ (\mathrm{mod}\ \mathbb{Z}) \in I$ for at least
$\delta N$ values of $n \in [N]$. Then there is a $k \in \mathbb{Z}$,
$0 < |k| \ll \delta^{-O_d(1)}$, such that
$\|kg \ (\mathrm{mod}\ \mathbb{Z})\|_{C^\infty[N]} \ll \varepsilon \delta^{-O_d(1)}$.

Proof. In this proof we allow all implied constants to depend on $d$. If
$\varepsilon \gg \delta^{C_d}$ for some large $C_d$ depending only on $d$ then the result follows
immediately from Proposition 4.3, so assume this is not the case. Expand $g$ in a Taylor
series as in (4.1), with coefficients $\alpha_0, \ldots, \alpha_d$. It follows from the assumption
that none of the polynomials $\lambda g$, $\lambda \le \delta/2\varepsilon$, is
$\delta^{O(1)}$-equidistributed on $[N]$. Thus by Proposition 4.3 we see that for each
$\lambda \le \delta/2\varepsilon$ there is $q_\lambda \ll \delta^{-O(1)}$ such that
$\|q_\lambda \lambda \alpha_i\|_{\mathbb{R}/\mathbb{Z}} \ll \delta^{-O(1)}/N^i$ for
$i = 1, \ldots, d$. Pigeonholing in the possible values of $q_\lambda$ we see that there is
$q \ll \delta^{-O(1)}$ such that for $\gg \delta^{O(1)}/\varepsilon$ values of
$\lambda \le \delta/2\varepsilon$ we have
$\|\lambda q \alpha_i\|_{\mathbb{R}/\mathbb{Z}} \ll \delta^{-O(1)}/N^i$ for each $i = 1, \ldots, d$.
It follows from Lemma 3.2 that for each $i$ there is $q_i \ll \delta^{-O(1)}$ such that
$\|q_i \alpha_i\|_{\mathbb{R}/\mathbb{Z}} \ll \varepsilon \delta^{-O(1)}/N^i$. Writing
$q := q_1 \cdots q_d$ we see that $q \ll \delta^{-O(1)}$ and that
$\|q\alpha_i\|_{\mathbb{R}/\mathbb{Z}} \ll \varepsilon \delta^{-O(1)}/N^i$ for all $i$.
This concludes the proof of the lemma. □

5. The Heisenberg example 

In this section we discuss the first example which is not just a rephrasing of classical
work on equidistribution, establishing Theorem 2.9 for a linear sequence on the Heisenberg
nilmanifold (1.1); thus $s = d = 2$ and $m = 3$. Strictly speaking, this section is not
necessary in order to prove Theorem 2.9 in the general case. However, we present this
"worked example" here in order to illustrate the key ideas of the main argument in a
simplified model setting. (Also, a key computation in this setting, namely Proposition
5.3, will be reused in the main argument.) As in the preceding section, the idea is to
use van der Corput's inequality to reduce the problem to a simpler problem, and in
particular to reduce to a "1-step" or "abelian" problem that can be treated by the tools
of the previous section. This turns out to work, but it will take a certain amount of
algebraic manipulation to see the 1-step structure emerge from van der Corput's inequality
applied to the 2-step Heisenberg situation.

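Let us begin with a brief tour of the Heisenberg example (1.1). We have
$\mathfrak{g} = \begin{pmatrix} 0 & \mathbb{R} & \mathbb{R} \\ 0 & 0 & \mathbb{R} \\ 0 & 0 & 0 \end{pmatrix}$,
with the exponential map being given by

$$\exp \begin{pmatrix} 0 & x & y \\ 0 & 0 & z \\ 0 & 0 & 0 \end{pmatrix} = \begin{pmatrix} 1 & x & y + \frac{1}{2}xz \\ 0 & 1 & z \\ 0 & 0 & 1 \end{pmatrix}$$

and the logarithm map by

$$\log \begin{pmatrix} 1 & x & y \\ 0 & 1 & z \\ 0 & 0 & 1 \end{pmatrix} = \begin{pmatrix} 0 & x & y - \frac{1}{2}xz \\ 0 & 0 & z \\ 0 & 0 & 0 \end{pmatrix}.$$

Observe that $\log \Gamma$ is not quite a lattice in $\mathbb{R}^3$, although it is a finite union
of lattices.

Consider the elements $X_1, X_2, X_3 \in \mathfrak{g}$ defined by

$$X_1 := \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \quad X_2 := \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{pmatrix} \quad \text{and} \quad X_3 := \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}.$$

It is easy to see that $\mathcal{X} = \{X_1, X_2, X_3\}$ is a Mal'cev basis adapted to the lower
central series filtration $G_\bullet$. A simple computation confirms that

$$\exp(t_1 X_1) \exp(t_2 X_2) \exp(t_3 X_3) = \begin{pmatrix} 1 & t_1 & t_1 t_2 + t_3 \\ 0 & 1 & t_2 \\ 0 & 0 & 1 \end{pmatrix},$$

and so the Mal'cev coordinate map $\psi_{\mathcal{X}} : G \to \mathbb{R}^3$ is given by

$$\psi_{\mathcal{X}} \begin{pmatrix} 1 & x & y \\ 0 & 1 & z \\ 0 & 0 & 1 \end{pmatrix} = (x, z, y - xz).$$

The horizontal torus is isomorphic to $(\mathbb{R}/\mathbb{Z})^2$, and the projection
$\pi : G \to (\mathbb{R}/\mathbb{Z})^2$ is given by

$$\pi \begin{pmatrix} 1 & x & y \\ 0 & 1 & z \\ 0 & 0 & 1 \end{pmatrix} = (x, z) \ (\mathrm{mod}\ \mathbb{Z}^2).$$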
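
The formulas above for the Heisenberg group are easy to verify by machine. The following Python sketch, with function names of our own choosing, works with exact rationals and checks that the Mal'cev coordinates of $\exp(t_1 X_1)\exp(t_2 X_2)\exp(t_3 X_3)$ are $(t_1, t_2, t_3)$.

```python
from fractions import Fraction


def mat_mul(A, B):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)] for i in range(3)]


def heis(x, y, z):
    """The group element with rows (1 x y), (0 1 z), (0 0 1)."""
    return [[1, x, y], [0, 1, z], [0, 0, 1]]


def exp_heis(x, y, z):
    """exp of the Lie algebra element with rows (0 x y), (0 0 z), (0 0 0)."""
    return heis(x, y + Fraction(x * z) / 2, z)


def log_heis(M):
    """Entries (x, y, z) of log M in the Lie algebra."""
    x, y, z = M[0][1], M[0][2], M[1][2]
    return (x, y - Fraction(x * z) / 2, z)


def malcev_coordinates(M):
    """Coordinates of the second kind: (x, z, y - x z)."""
    x, y, z = M[0][1], M[0][2], M[1][2]
    return (x, z, y - x * z)


def from_malcev(t1, t2, t3):
    """exp(t1 X1) exp(t2 X2) exp(t3 X3) for the basis X1, X2, X3 above."""
    g1 = exp_heis(t1, 0, 0)
    g2 = exp_heis(0, 0, t2)
    g3 = exp_heis(0, t3, 0)
    return mat_mul(mat_mul(g1, g2), g3)
```

Note that the coordinates of the first kind returned by `log_heis` differ from `malcev_coordinates` only in the central entry ($y - \frac{1}{2}xz$ versus $y - xz$).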



We shall be working through the special case of Theorem 2.9 in the case when
$g : \mathbb{Z} \to G$ is a linear sequence. To simplify the exposition very slightly we will
assume that this sequence has no constant term, thus $g(n) = a^n$ for some $a \in G$. Note
that $g \in \mathrm{poly}(\mathbb{Z}, G_\bullet)$, where $G_\bullet$ is the lower central series
filtration. Thus the sequence $g$ has degree 2.

Proposition 5.1 (Main theorem, Heisenberg case). Let $G/\Gamma$ be the 2-step Heisenberg
nilmanifold with the Mal'cev basis $\mathcal{X}$ described above, and let $g : \mathbb{Z} \to G$
be a linear sequence of the form $g(n) = a^n$. Let $\delta > 0$ be a parameter and let
$N \ge 1$ be an integer. Then either $(g(n)\Gamma)_{n \in [N]}$ is $\delta$-equidistributed, or
else there is a horizontal character $\eta$ with $0 < |\eta| \ll \delta^{-O(1)}$ such that
$\|\eta(a)\|_{\mathbb{R}/\mathbb{Z}} \ll \delta^{-O(1)}/N$.

Remark. Note that, since $g(n)$ is linear, the last condition here is equivalent to the
statement that $\|\eta \circ g\|_{C^\infty[N]} \ll \delta^{-O(1)}$.



Proof. By Lemma 3.7 we may assume that there is a function $F : G/\Gamma \to \mathbb{C}$ with
a vertical oscillation $\xi$ with $|\xi| \ll \delta^{-O(1)}$, and $\|F\|_{\mathrm{Lip}} = 1$,
such that

$$\Big| \mathbb{E}_{n \in [N]} F(a^n \Gamma) - \int_{G/\Gamma} F \Big| \gg \delta^{O(1)}. \tag{5.1}$$

We split into two cases: $\xi = 0$ and $\xi \ne 0$.

If $\xi = 0$, then $F$ is $G_2$-invariant, which means we may factor through $\pi$ to get a
function $\tilde{F} : \mathbb{R}^2/\mathbb{Z}^2 \to \mathbb{C}$ defined by

$$F(x) = \tilde{F}(\pi(x)).$$

It is clear that $\|\tilde{F}\|_{\mathrm{Lip}} \le 1$. Equation (5.1) implies that

$$\Big| \mathbb{E}_{n \in [N]} \tilde{F}(n\pi(a)) - \int_{\mathbb{R}^2/\mathbb{Z}^2} \tilde{F} \Big| \gg \delta^{O(1)} \|\tilde{F}\|_{\mathrm{Lip}}.$$

Proposition 5.1 in this case now follows immediately from Proposition 3.1. Note how
the $G_2$-invariance allowed us to reduce a 2-step problem into a 1-step one.
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Suppose then that £ ^ 0. The integral of F over every translate of G 2 /(r D G 2 ) is 
then zero, and hence J G ^ r F = 0. Thus 05. 1H becomes 

\E n€[N] F(a n T)\^5°^. 

We now come to one of the key ideas of the proof, which is to apply the van der Corput lemma, Corollary 4.2. This tells us that there are $\gg \delta^{O(1)}N$ values of $h \in [N]$ such that

\[ |\mathbb{E}_{n \in [N]} F(a^{n+h}\Gamma)\overline{F(a^n\Gamma)}| \gg \delta^{O(1)}. \tag{5.2} \]

It is very natural to try and interpret this in terms of a nilsequence on the product nilmanifold $G^2/\Gamma^2$. To do this we first observe by direct computation that any $x \in G$ may be factored uniquely as $x = \{x\}[x]$, where $\psi(\{x\}) \in [0,1)^3$ and $[x] \in \Gamma$.
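The factorization $x = \{x\}[x]$ can be carried out explicitly. The sketch below is a minimal illustration under two stated assumptions: elements are encoded as triples $(x, y, z)$ for the matrix with rows $(1\ x\ y)$, $(0\ 1\ z)$, $(0\ 0\ 1)$, and the Mal'cev coordinate map is taken to be $\psi(x, y, z) = (x, z, y - xz)$. The function names are hypothetical.

```python
from fractions import Fraction
from math import floor

def mul(p, q):
    """Matrix product in the assumed (x, y, z) encoding."""
    return (p[0] + q[0], p[1] + q[1] + p[0] * q[2], p[2] + q[2])

def inv(p):
    """Inverse of a Heisenberg element."""
    x, y, z = p
    return (-x, -y + x * z, -z)

def psi(p):
    """Assumed Mal'cev coordinates: psi(x, y, z) = (x, z, y - x*z)."""
    x, y, z = p
    return (x, z, y - x * z)

def factor(p):
    """Return ({p}, [p]) with p = {p}[p], psi({p}) in [0,1)^3, [p] integral."""
    x, y, z = p
    a, b = floor(x), floor(z)
    # Chosen so that the third Mal'cev coordinate of p * [p]^{-1} lands in [0,1).
    c = floor(y - x * z + a * z)
    integer_part = (Fraction(a), Fraction(c), Fraction(b))
    fractional_part = mul(p, inv(integer_part))
    return fractional_part, integer_part

p = (Fraction(7, 3), Fraction(-5, 4), Fraction(9, 2))
frac, integ = factor(p)
assert mul(frac, integ) == p                      # p = {p}[p]
assert all(0 <= t < 1 for t in psi(frac))         # psi({p}) in [0,1)^3
assert all(t.denominator == 1 for t in integ)     # [p] has integer entries
```

The first two coordinates are handled by ordinary floors; only the third requires the correction term $az$, which reflects the non-commutativity of the group.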

Let us, then, factor $a^h = \{a^h\}[a^h]$. The inequality (5.2) implies that

\[ |\mathbb{E}_{n \in [N]} F(a^n\{a^h\}\Gamma)\overline{F(a^n\Gamma)}| \gg \delta^{O(1)} \]

for $\gg \delta^{O(1)}N$ values of $h$. This can be rewritten as

\[ |\mathbb{E}_{n \in [N]} \tilde F_h(\tilde a_h^n \Gamma^2)| \gg \delta^{O(1)} \tag{5.3} \]

for $\gg \delta^{O(1)}N$ values of $h$, where $\tilde F_h : G^2/\Gamma^2 \to \mathbb{C}$ is given by

\[ \tilde F_h(x, y) := F(\{a^h\}x)\overline{F(y)} \]

and the element $\tilde a_h$ is given by

\[ \tilde a_h := (\{a^h\}^{-1} a \{a^h\}, a). \]

At first sight, the estimates (5.3) do not appear much better than our original estimate (5.1); indeed, it seems "worse" since we are now working on a 6-dimensional 2-step nilmanifold rather than a 3-dimensional 2-step one.

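The crucial observation, however, is that all the elements $\tilde a_h$ in fact lie not just in $G^2$, but in the smaller group

\[ G^\square = G \times_{G_2} G := \{(g, g') : g^{-1}g' \in G_2\}. \]

This is also a 2-step nilpotent, connected, simply connected Lie group (of dimension 4). It is not hard to check that $[G^\square, G^\square]$ is the diagonal group $G_2^\Delta := \{(g_2, g_2) : g_2 \in G_2\}$, and that one can take for a Mal'cev basis of $G^\square/\Gamma^\square$ the collection $\mathcal{X}^\square = \{X_1^\square, X_2^\square, X_3^\square, X_4^\square\}$ given by

\[ X_1^\square = \begin{pmatrix} 0 & 1 & \{0,0\} \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \quad X_2^\square = \begin{pmatrix} 0 & 0 & \{0,0\} \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{pmatrix}, \quad X_3^\square = \begin{pmatrix} 0 & 0 & \{1,0\} \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix} \quad \text{and} \quad X_4^\square = \begin{pmatrix} 0 & 0 & \{1,1\} \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \]

where we have written

\[ \begin{pmatrix} 0 & x & \{y,y'\} \\ 0 & 0 & z \\ 0 & 0 & 0 \end{pmatrix} := \left( \begin{pmatrix} 0 & x & y \\ 0 & 0 & z \\ 0 & 0 & 0 \end{pmatrix}, \begin{pmatrix} 0 & x & y' \\ 0 & 0 & z \\ 0 & 0 & 0 \end{pmatrix} \right). \]

This allows us to identify the horizontal torus of $G^\square/\Gamma^\square$ with $\mathbb{R}^3/\mathbb{Z}^3$ by projecting onto the first three coordinates.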

Now (5.3) implies that for $\gg \delta^{O(1)}N$ values of $h$ we have

\[ |\mathbb{E}_{n \in [N]} F_h^\square((a_h^\square)^n \Gamma^\square)| \gg \delta^{O(1)}, \tag{5.4} \]

where $F_h^\square$ and $a_h^\square$ are the restrictions of $\tilde F_h$ and $\tilde a_h$ to $G^\square$, and $\Gamma^\square := \Gamma \times_{\Gamma \cap G_2} \Gamma$. By inspecting the action of $G_2 \times G_2$ on $F_h^\square$ (and the hypothesis $\xi \neq 0$) we also conclude that $\int_{G^\square/\Gamma^\square} F_h^\square = 0$.

Now, the group $G^\square$ is still 2-step nilpotent, so we do not appear to have reduced to a 1-step situation yet. However, recall that $F$ has vertical oscillation $\xi$. Using this and the fact that $G_2$ is central in $G$, we obtain

\[ F_h^\square((g_2, g_2) \cdot (g, g')) = F(\{a^h\}g_2 g)\overline{F(g_2 g')} = e(\xi(g_2))\overline{e(\xi(g_2))}F(\{a^h\}g)\overline{F(g')} = F_h^\square((g, g')). \]

Thus $F_h^\square$ is $[G^\square, G^\square]$-invariant. In (5.4) we may therefore factor through the projection $\pi^\square$ to obtain

\[ |\mathbb{E}_{n \in [N]} F'_h(n\pi^\square(a_h^\square))| \gg \delta^{O(1)} \]

for $\gg \delta^{O(1)}N$ values of $h$, where the function $F'_h : \mathbb{R}^3/\mathbb{Z}^3 \to \mathbb{C}$ is defined by

\[ F'_h(\pi^\square(x)) = F_h^\square(x\Gamma^\square). \]

We leave it to the reader to check that $\|F'_h\|_{\mathrm{Lip}} = O(1)$ (in the general case to follow this computation is given in more detail). Since $F_h^\square$ has mean zero, we see that $F'_h$ has mean zero also.

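We are now finally in a situation in which we may apply "1-step" tools. Indeed, from Proposition 3.1 we see that for each such $h$ there is some $k_h^\square \in \mathbb{Z}^3$, $0 < |k_h^\square| \ll \delta^{-O(1)}$, such that

\[ \|k_h^\square \cdot \pi^\square(a_h^\square)\|_{\mathbb{R}/\mathbb{Z}} \ll \delta^{-O(1)}/N. \]

Pigeonholing in $h$, we may assume that $k_h^\square = k^\square$ is independent of $h$. Define $\eta : G^\square \to \mathbb{R}/\mathbb{Z}$ by

\[ \eta(x) := k^\square \cdot \pi^\square(x). \]

Then $\eta$ is an additive homomorphism which annihilates $[G^\square, G^\square]$ and $\Gamma^\square$, and we have

\[ \|\eta(a_h^\square)\|_{\mathbb{R}/\mathbb{Z}} \ll \delta^{-O(1)}/N \tag{5.5} \]

for $\gg \delta^{O(1)}N$ values of $h \in [N]$.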

Our task now is to "piece together" these pieces of information for many different $h$ to deduce Proposition 5.1. We begin by factoring the character $\eta$ on $G^\square$ into two simpler components, which originate from $G$ (or $G_2$) rather than $G^\square$.

Lemma 5.2 (Decomposition of $\eta$). There exist horizontal characters $\eta_1 : G \to \mathbb{R}/\mathbb{Z}$ and $\eta_2 : G_2 \to \mathbb{R}/\mathbb{Z}$ on $G$ and $G_2$ respectively (thus $\eta_1$ annihilates $\Gamma$ and $\eta_2$ annihilates $\Gamma \cap G_2$) such that

\[ \eta(g', g) = \eta_1(g) + \eta_2(g'g^{-1}) \tag{5.6} \]

for all $(g', g) \in G^\square$. Furthermore we have $|\eta_1|, |\eta_2| \ll \delta^{-O(1)}$.

Proof. Since $\eta$ is an additive homomorphism we have $\eta(g', g) = \eta((g'g^{-1}, \mathrm{id}_G) \cdot (g, g)) = \eta(g, g) + \eta(g'g^{-1}, \mathrm{id}_G)$. Thus if we define $\eta_1(g) := \eta(g, g)$ and $\eta_2(g_2) := \eta(g_2, \mathrm{id}_G)$ then (5.6) is immediately seen to hold. Now $\eta_1$ is a horizontal character because $\eta$ annihilates $\Gamma^\square$, which contains $\Gamma^\Delta$. Furthermore $\Gamma^\square$ also contains $(\Gamma \cap G_2) \times \mathrm{id}_G$ and hence $\eta_2$ annihilates $\Gamma \cap G_2$ as claimed. The bounds on $|\eta_1|$ and $|\eta_2|$ are left as an exercise to the reader; one may compute explicitly with the Mal'cev bases $\mathcal{X}^\square$ and $\mathcal{X}$ on $G^\square/\Gamma^\square$ and $G/\Gamma$ respectively. □

Using this decomposition and the fact that, in the Heisenberg group, we have the identity $x^{-1}yxy^{-1} = [y,x]$ since $[x,y]$ is central, we see that

\[ \eta(a_h^\square) = \eta_1(a) + \eta_2([a, \{a^h\}]). \]

Now a straightforward computation with matrices confirms that if $\psi(x) = (t_1, t_2, t_3)$ and $\psi(y) = (u_1, u_2, u_3)$ then $\psi([x,y]) = (0, 0, t_1u_2 - t_2u_1)$, and also that if $\psi(a) = (\gamma_1, \gamma_2, *)$ then $\psi(\{a^h\}) = (\{\gamma_1 h\}, \{\gamma_2 h\}, *)$, where we do not care about the values of the coordinates marked with an asterisk $*$. Thus if we write $\gamma := (\gamma_1, \gamma_2) = \pi(a)$ and $\zeta := (-\gamma_2, \gamma_1)$ then

\[ \eta(a_h^\square) = k_1 \cdot \gamma + k_2\, \zeta \cdot \{\gamma h\}, \]

where $k_1, k_2 = O(\delta^{-O(1)})$ are the frequencies of $\eta_1, \eta_2$ respectively. Thus if (5.5) holds then

\[ \|k_1 \cdot \gamma + k_2\, \zeta \cdot \{\gamma h\}\|_{\mathbb{R}/\mathbb{Z}} \ll \delta^{-O(1)}/N \tag{5.7} \]

for $\gg \delta^{O(1)}N$ values of $h$. The next proposition derives diophantine information concerning $\gamma$ and $\zeta$ from a hypothesis such as this. In fact we handle a slightly more general situation, since this will be useful when we come to handle the general case of Theorem 2.9. In the following proposition we shall take $\alpha = 0$ and $m = 2$; the proof when $\alpha = 0$ is actually considerably shorter and the reader may care to work through that case to better understand the argument.
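The two matrix computations used above are easy to confirm in exact arithmetic. The following sketch assumes the triple encoding $(x, y, z)$ for the matrix with rows $(1\ x\ y)$, $(0\ 1\ z)$, $(0\ 0\ 1)$, the coordinate map $\psi(x, y, z) = (x, z, y - xz)$, and the commutator convention $[p, q] = p^{-1}q^{-1}pq$; all three are assumptions made for the purposes of illustration. It checks that the third coordinate of $\{a^h\}^{-1} a \{a^h\} a^{-1}$ equals $\zeta \cdot \{\gamma h\}$ with $\zeta = (-\gamma_2, \gamma_1)$.

```python
from fractions import Fraction
from math import floor

def mul(p, q):
    return (p[0] + q[0], p[1] + q[1] + p[0] * q[2], p[2] + q[2])

def inv(p):
    x, y, z = p
    return (-x, -y + x * z, -z)

def psi(p):
    """Assumed Mal'cev coordinates: psi(x, y, z) = (x, z, y - x*z)."""
    x, y, z = p
    return (x, z, y - x * z)

def commutator(p, q):
    """[p, q] = p^{-1} q^{-1} p q (assumed convention)."""
    return mul(mul(inv(p), inv(q)), mul(p, q))

def power(a, n):
    result = (Fraction(0), Fraction(0), Fraction(0))
    for _ in range(n):
        result = mul(result, a)
    return result

def fractional_part(p):
    """{p} = p [p]^{-1}, where [p] is integral and psi({p}) lies in [0,1)^3."""
    x, y, z = p
    a, b = floor(x), floor(z)
    c = floor(y - x * z + a * z)
    return mul(p, inv((Fraction(a), Fraction(c), Fraction(b))))

a = (Fraction(1, 3), Fraction(1, 5), Fraction(2, 7))   # psi(a) = (1/3, 2/7, *)
g1, g2 = a[0], a[2]                                    # gamma = pi(a)
for h in range(1, 8):
    fh = fractional_part(power(a, h))
    u1, u2, _ = psi(fh)
    assert (u1, u2) == ((g1 * h) % 1, (g2 * h) % 1)    # psi({a^h}) = ({g1 h}, {g2 h}, *)
    lhs = mul(mul(inv(fh), a), mul(fh, inv(a)))        # {a^h}^{-1} a {a^h} a^{-1}
    assert lhs == commutator(a, fh)
    assert psi(lhs) == (0, 0, g1 * u2 - g2 * u1)       # zeta . {gamma h}
```

The quantity $\gamma_1\{\gamma_2 h\} - \gamma_2\{\gamma_1 h\}$ is a "bracket polynomial" in $h$, which is why the next proposition is needed.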

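Proposition 5.3 (Bracket polynomial lemma). Let $\delta \in (0, 1)$ and let $N \geq 1$ be an integer. Suppose that $\alpha, \beta \in \mathbb{R}$ and that $|\alpha| \leq 1/\delta N$. Suppose that $\gamma \in \mathbb{R}^m/\mathbb{Z}^m$ and that $\zeta \in \mathbb{R}^m$ satisfies $|\zeta| \leq 1/\delta$. Suppose that for at least $\delta N$ values of $h \in [N]$ we have

\[ \|\beta + \alpha h + \zeta \cdot \{\gamma h\}\|_{\mathbb{R}/\mathbb{Z}} \leq 1/\delta N. \tag{5.8} \]

Then either $|\zeta_i| \ll_m \delta^{-O_m(1)}/N$ for all $1 \leq i \leq m$, or else there is some $k \in \mathbb{Z}^m$, $0 < |k| \ll_m \delta^{-O_m(1)}$, such that $\|k \cdot \gamma\|_{\mathbb{R}/\mathbb{Z}} \ll_m \delta^{-O_m(1)}/N$.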

Proof. If $\sup_i |\zeta_i| \leq 1/\delta N$ then we are done, so assume this is not the case. Then the assumption implies that $\|\beta + \alpha h\|_{\mathbb{R}/\mathbb{Z}} \leq (1 + m)\sup_i |\zeta_i|$ for $\geq \delta N$ values of $h \in [N]$. Then Lemma 3.2 implies that there is $q \ll \delta^{-C}$ such that $\|q\alpha\|_{\mathbb{R}/\mathbb{Z}} \ll_m \sup_i |\zeta_i|\,\delta^{-C}/N$ for some absolute constant $C > 0$. Since we are assuming that $|\alpha| \leq 1/\delta N$ this forces us to conclude that in fact $|\alpha| \ll_m \sup_i |\zeta_i|\,\delta^{-C}/N$ unless $N \ll_m \delta^{-O(1)}$, in which case the result is trivial in any case.

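Split $[N]$ into intervals of length between $N'$ and $2N'$, where $N' := c_m\delta^{C+1}N$ and $c_m > 0$ is a small number to be chosen later. By the pigeonhole principle, we can find one of these intervals $I$ in which there are $\geq \delta|I|$ values of $h$ such that (5.8) holds. If $c_m$ is chosen sufficiently small then $\alpha h$ does not vary by more than $\frac{\delta}{20}\sup_i |\zeta_i|$ on such an interval, and we conclude that there is $\theta$ such that

\[ \|\theta + \zeta \cdot \{\gamma h\}\|_{\mathbb{R}/\mathbb{Z}} \leq \frac{\delta}{20}\sup_i |\zeta_i| + \frac{1}{\delta N} \]

for at least $\delta|I|$ values of $h \in I$. Now if $\sup_i |\zeta_i| \leq \frac{20}{\delta^2 N}$ then the proposition holds, so we may assume that this is not the case, in which eventuality we have

\[ \|\theta + \zeta \cdot \{\gamma h\}\|_{\mathbb{R}/\mathbb{Z}} \leq \frac{\delta}{10}|\zeta_i| \tag{5.9} \]

for some $i \in [m]$ and for at least $\delta|I|$ values of $h \in I$. We then set

\[ \Omega := \{t \in \mathbb{R}^m/\mathbb{Z}^m : \|\theta + \zeta \cdot \{t\}\|_{\mathbb{R}/\mathbb{Z}} \leq \tfrac{\delta}{10}|\zeta_i|\} \]

and

\[ \tilde\Omega := \{x \in \mathbb{R}^m/\mathbb{Z}^m : \mathrm{dist}(x, \Omega) \leq \delta/10\}. \]

For fixed $u \in \mathbb{R}^m/\mathbb{Z}^m$ the slice

\[ \{t \in \tilde\Omega : t_j = u_j \text{ for } j \neq i\} \]

is a union of intervals of length less than $\delta/2$, and so $\mathrm{vol}(\tilde\Omega) \leq \delta/2$. Let $F : \mathbb{R}^m/\mathbb{Z}^m \to \mathbb{R}^+$ be the function

\[ F(x) := \max\Big(1 - \frac{10\,\mathrm{dist}(x, \Omega)}{\delta}, 0\Big). \]

Then $F = 1$ on $\Omega$ and so our assumption implies that

\[ \mathbb{E}_{n \in I} F(\gamma n) \geq \delta. \tag{5.10} \]

On the other hand $F$ is supported on $\tilde\Omega$ and so

\[ \int_{\mathbb{R}^m/\mathbb{Z}^m} F(x)\,dx \leq \mathrm{vol}(\tilde\Omega) \leq \frac{\delta}{2}. \tag{5.11} \]

Thus of course

\[ \Big| \mathbb{E}_{n \in I} F(\gamma n) - \int_{\mathbb{R}^m/\mathbb{Z}^m} F(x)\,dx \Big| \geq \frac{\delta}{2}. \]

However $F$ has been constructed so that $\|F\|_{\mathrm{Lip}} \ll 1/\delta$ (we leave this as an exercise) and so we conclude that $(\gamma n)_{n \in I}$ is not $c\delta^2$-equidistributed. Applying Proposition 3.1 we conclude that there is $0 < |k| \ll \delta^{-O_m(1)}$ such that $\|k \cdot \gamma\|_{\mathbb{R}/\mathbb{Z}} \ll \delta^{-O_m(1)}/N' \ll \delta^{-O_m(1)}/N$, and the claim follows. □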



Recall that in our efforts to prove Proposition 5.1 we had established the condition (5.7). Applying Proposition 5.3 and recalling that $\gamma = (\gamma_1, \gamma_2)$ and $\zeta = (-\gamma_2, \gamma_1)$ we see that in all cases there is some nonzero $k' \in \mathbb{Z}^2$ with $|k'| \ll \delta^{-O(1)}$ such that $\|k' \cdot \gamma\|_{\mathbb{R}/\mathbb{Z}} \ll \delta^{-O(1)}/N$, that is to say $\|k' \cdot \pi(a)\|_{\mathbb{R}/\mathbb{Z}} \ll \delta^{-O(1)}/N$. This concludes the proof of Proposition 5.1. □

Let us pause for a moment to consider the form of the argument just presented. There were two places where we reduced matters to a simpler situation. First of all in the case $\xi = 0$ we were able to consider $F$ as a function on a 1-step nilmanifold. Secondly when we applied the van der Corput trick we found ourselves with a function $F_h^\square$ which had $0$ as a vertical frequency, and so we were again able to reduce to the 1-step case, although we had to restrict the ambient nilmanifold (from $G^2/\Gamma^2$ to $G^\square/\Gamma^\square$) and also quotient out by a commutator group $[G^\square, G^\square]$ before the 1-step structure became manifest. This already makes it clear that some kind of induction is going on, and in the general case we will see this quite clearly.



6. Polynomial sequences in nilpotent groups 



Our analysis of linear sequences on the Heisenberg example captured much of the essence of the proof of Theorem 2.9 in general. What it did not reveal, however, was the rather subtle structure of the space of polynomial sequences $g : \mathbb{Z} \to G$. In this section we begin by establishing a remarkable result of Lazard [12], which asserts that $\mathrm{poly}(\mathbb{Z}, G_\bullet)$ is a group for any filtration $G_\bullet$. Lazard's proof uses the Lie algebra $\mathfrak{g}$ and it works if $G$ is a connected and simply-connected Lie group (as in the present paper). However it turns out that the result is true with no topological assumptions on $G$, and indeed in the greater generality of so-called polynomial mappings from $H$ to $G$, where $H$ is an arbitrary group. This result is due to Leibman [21] (see also [20] for a proof of the special case $H = \mathbb{Z}$).

We will then use the Lazard-Leibman results to derive sundry further results concerning the representation of elements of $\mathrm{poly}(\mathbb{Z}, G_\bullet)$ in coordinates. In fact, keeping in mind our intention to prove multiparameter results in §8, we develop the theory of polynomial maps $\mathrm{poly}(\mathbb{Z}^t, G_\bullet)$.

Definition 6.1 (Polynomial maps). Let $H$ be a group and let $G$ be a nilpotent group with a filtration $G_\bullet$. If $g : H \to G$ is a map and if $h \in H$ we write $\partial_h g$ for the map defined by $\partial_h g(x) = g(xh)g(x)^{-1}$. We say that $g$ is a polynomial map with coefficients in $G_\bullet$ if we have $\partial_{h_1} \cdots \partial_{h_i} g(x) \in G_i$ for all choices of $i$ and for all $h_1, \ldots, h_i \in H$ and $x \in H$. We write $\mathrm{poly}(H, G_\bullet)$ for the collection of all such mappings. If $g : H \to G$ is a map we say that $g$ is a polynomial sequence of degree at most $d$ if there exists a filtration $G_\bullet$ of degree at most $d$ such that $g$ has coefficients in $G_\bullet$.

Proposition 6.2 (Lazard-Leibman theorem [21]). Let $H$ be a group, let $G$ be a nilpotent group, and let $G_\bullet$ be a filtration. Then $\mathrm{poly}(H, G_\bullet)$, the space of polynomial maps $g : H \to G$ having coefficients in $G_\bullet$, is a group.

Remarks. This result is contained in [21] (although the result is only stated in the case that $G_\bullet$ is the lower central series filtration, the proof does not use this fact). Our proof is a little different, relying on the machinery of Host-Kra cube groups. These featured for the first time in [15, §5, §11] and were discussed subsequently in [TSJ Appendix E]. See also the recent preprint [17]. We thank Sasha Leibman for helpful conversations concerning these methods.

One should mention at this point the Hall-Petresco theorem [El [27], which established a special case of the Lazard-Leibman theorem. This theorem states that if $G_\bullet$ is the lower central series filtration then the sequence $n \mapsto a^n b^n$ lies in $\mathrm{poly}(\mathbb{Z}, G_\bullet)$ for any $a, b \in G$.
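To make Definition 6.1 and the Hall-Petresco statement concrete, the following sketch checks the derivative conditions for the sequence $n \mapsto a^n b^n$ in the integer Heisenberg group with its lower central series filtration $G_0 = G_1 = G$, $G_2 = [G, G]$, $G_3 = \{\mathrm{id}_G\}$. The triple encoding $(x, y, z)$ for the matrix with rows $(1\ x\ y)$, $(0\ 1\ z)$, $(0\ 0\ 1)$ and the helper names are assumptions for illustration only.

```python
def mul(p, q):
    """Matrix product in the assumed (x, y, z) encoding."""
    return (p[0] + q[0], p[1] + q[1] + p[0] * q[2], p[2] + q[2])

def inv(p):
    x, y, z = p
    return (-x, -y + x * z, -z)

def power(a, n):
    """a^n for any integer n (negative powers use the inverse)."""
    base = a if n >= 0 else inv(a)
    result = (0, 0, 0)
    for _ in range(abs(n)):
        result = mul(result, base)
    return result

def derivative(g, h):
    """The map n -> g(n + h) g(n)^{-1}, i.e. the derivative of g in direction h."""
    return lambda n: mul(g(n + h), inv(g(n)))

a, b = (1, 0, 0), (0, 0, 1)

def g(n):
    return mul(power(a, n), power(b, n))

for n in range(-3, 4):
    for h1 in range(1, 4):
        for h2 in range(1, 4):
            second = derivative(derivative(g, h1), h2)(n)
            # Second derivatives lie in G_2, the centre: x = z = 0.
            assert second[0] == 0 and second[2] == 0
            for h3 in range(1, 4):
                third = derivative(derivative(derivative(g, h1), h2), h3)(n)
                # Third derivatives lie in G_3 = {identity}.
                assert third == (0, 0, 0)
```

Here $g$ is not a homomorphism, since $a$ and $b$ do not commute, yet it still has degree 2 in the sense of Definition 6.1.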

In this section it is convenient to generalise the notion of a filtration somewhat. By a prefiltration $G_\bullet$ on a nilpotent group $G$ we mean a sequence

\[ G \supseteq G_0 \supseteq G_1 \supseteq \cdots \supseteq G_d \supseteq G_{d+1} = \{\mathrm{id}_G\} \]

of subgroups with the property that $[G_i, G_j] \subseteq G_{i+j}$ for all $i, j \geq 0$. The only difference between a prefiltration and a filtration (cf. Definition 1.1) is that we no longer require that $G = G_0 = G_1$. The definition of $\mathrm{poly}(H, G_\bullet)$ extends in a completely obvious way to prefiltrations.

For each integer $k \geq 0$ we are going to define the Host-Kra cube group $\mathrm{HK}^k(G_\bullet)$ associated to the prefiltration $G_\bullet$. This will be a subgroup of $G^{\{0,1\}^k}$, the product of $2^k$ copies of $G$ indexed by the cube $\{0,1\}^k$. Before giving the definition, we need to set up some nomenclature concerning these cubes.

Each element $\omega \in \{0,1\}^k$ corresponds in an obvious way to a subset of $[k]$, and we write $\omega \subseteq \omega'$ when the corresponding sets are nested. An upper face $F$ is a subset of $\{0,1\}^k$ of the form $F(\omega_0) := \{\omega \in \{0,1\}^k : \omega \supseteq \omega_0\}$. There are, of course, $2^k$ upper faces, one for each $\omega_0 \in \{0,1\}^k$. The codimension $\mathrm{codim}(F)$ of $F$ is simply the number of ones in $\omega_0$. Note that if $F, F'$ are two upper faces then $F \cap F'$ is also an upper face, and $\mathrm{codim}(F \cap F') \leq \mathrm{codim}(F) + \mathrm{codim}(F')$.
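The combinatorial facts about upper faces just stated can be verified by brute force for small $k$. In the sketch below (an illustration only; the function names are not from the text) vertices of the cube are tuples of 0s and 1s, and the intersection $F(\omega_0) \cap F(\omega_0')$ is compared with the upper face $F(\omega_0 \cup \omega_0')$.

```python
from itertools import product

def upper_face(omega0):
    """F(omega0): all omega in {0,1}^k with omega >= omega0 coordinatewise."""
    k = len(omega0)
    return frozenset(
        w for w in product((0, 1), repeat=k)
        if all(wi >= oi for wi, oi in zip(w, omega0))
    )

def codim(omega0):
    """Codimension of F(omega0): the number of ones in omega0."""
    return sum(omega0)

k = 3
cube = list(product((0, 1), repeat=k))
faces = {w0: upper_face(w0) for w0 in cube}

# There are exactly 2^k distinct upper faces.
assert len(set(faces.values())) == 2 ** k

for w0 in cube:
    for w1 in cube:
        union = tuple(max(x, y) for x, y in zip(w0, w1))
        # The intersection of two upper faces is the upper face of the union.
        assert faces[w0] & faces[w1] == faces[union]
        assert codim(union) <= codim(w0) + codim(w1)
```

An upper face of codimension $c$ has $2^{k-c}$ vertices, so the smallest upper face is the single vertex $1^k$ and the largest is the whole cube.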

Given an upper face $F$ and an element $x \in G$ we write $x^F$ for the element of $G^{\{0,1\}^k}$ defined by

\[ (x^F)_\omega := \begin{cases} x & \text{if } \omega \in F \\ \mathrm{id}_G & \text{otherwise.} \end{cases} \]

Write $\Gamma_F$ for the subgroup of $G^{\{0,1\}^k}$ consisting of all elements $x^F$ with $x \in G_{\mathrm{codim}(F)}$, where $G_i$ is the $i$th group in the prefiltration $G_\bullet$; we call such a group an upper face group.

Definition 6.3 (Host-Kra cube group). Let $G_\bullet$ be a prefiltration on a nilpotent group $G$, and let $k \geq 0$ be an integer. Then the Host-Kra cube group $\mathrm{HK}^k(G_\bullet)$ is the subgroup of $G^{\{0,1\}^k}$ generated by the upper face groups $\Gamma_F$.

The Host-Kra cube group can, it turns out, be described in a rather explicit way. Write $\prec$ for the reverse lexicographic ordering on $\{0,1\}^k$, thus $\omega \prec \omega'$ if and only if there is some $j$ such that $\omega_j < \omega'_j$ and $\omega_i = \omega'_i$ for $i = j+1, \ldots, k$. This induces an ordering on the upper faces $F$. We write $F(\omega) \succ F(\omega')$ if and only if $\omega \prec \omega'$. Let $F_0 \prec F_1 \prec \cdots \prec F_{2^k-1}$ be the complete list of upper faces in this order; thus $F_0 = \{1^k\}$ and $F_{2^k-1} = \{0,1\}^k$.

Lemma 6.4 (Description of Host-Kra cube group). We have

\[ \mathrm{HK}^k(G_\bullet) = \Gamma_{F_0} \cdot \Gamma_{F_1} \cdots \Gamma_{F_{2^k-1}}. \]

That is, every element of $\mathrm{HK}^k(G_\bullet)$ may be written as $\gamma_0^{F_0} \cdots \gamma_{2^k-1}^{F_{2^k-1}}$ where $\gamma_i \in G_{\mathrm{codim}(F_i)}$. The representation is in fact unique.

Proof. The key point here is the inclusion

\[ [\Gamma_F, \Gamma_{F'}] \subseteq \Gamma_{F \cap F'}. \tag{6.1} \]

This follows immediately from the fact that

\[ [G_{\mathrm{codim}(F)}, G_{\mathrm{codim}(F')}] \subseteq G_{\mathrm{codim}(F) + \mathrm{codim}(F')}. \]

Using this fact repeatedly, we shift all elements coming from $\Gamma_{F_0}$ to the left. We then shift all elements coming from $\Gamma_{F_1}$ to the left, and so on. We leave the routine details and the proof that the representation is unique (which we do not actually need) to the reader. □

Host-Kra cube groups and polynomial maps. It is now time to develop the link between Host-Kra cube groups $\mathrm{HK}^k(G_\bullet)$ and polynomial maps $g \in \mathrm{poly}(H, G_\bullet)$. To do this we introduce the notion of a parallelepiped on $H$. This is an element in $H^{\{0,1\}^k}$ of the form $(xh^\omega)_{\omega \in \{0,1\}^k}$, where $x \in H$, $h = (h_1, \ldots, h_k)$ is a $k$-tuple of elements of $H$, and $h^\omega := h_1^{\omega_1} \cdots h_k^{\omega_k}$. For example the tuple $(x, xh_1, xh_2, xh_1h_2)$ is a parallelepiped in $H^{\{0,1\}^2}$, and $(x, xh_1, xh_2, xh_1h_2, xh_3, xh_1h_3, xh_2h_3, xh_1h_2h_3)$ is a parallelepiped in $H^{\{0,1\}^3}$. Write $H^{[k]}$ for the set of parallelepipeds in $H^{\{0,1\}^k}$ (if $H$ is abelian $H^{[k]}$ is actually a group, but this need not be the case in general and in any case is not important here).

Suppose that $g : H \to G$ is a map. Then for any $k \geq 0$ there is an obvious induced map $g^{\{0,1\}^k} : H^{\{0,1\}^k} \to G^{\{0,1\}^k}$.

Proposition 6.5 (Characterization of polynomial maps). Suppose that $H$ is a group, that $G$ is a nilpotent group together with a prefiltration $G_\bullet$, and that $g : H \to G$. Then $g$ lies in $\mathrm{poly}(H, G_\bullet)$ if and only if $g^{\{0,1\}^k}$ maps $H^{[k]}$ to $\mathrm{HK}^k(G_\bullet)$ for all $k \geq 0$.

Remark. The reader might find it useful, as an exercise to get to grips with the notation, to verify this in the case $H = G$ and $g$ being the identity mapping.

We note that Proposition 6.2 is an immediate consequence of Proposition 6.5. Indeed if $g^{\{0,1\}^k}$ and $\tilde g^{\{0,1\}^k}$ both map $H^{[k]}$ to $\mathrm{HK}^k(G_\bullet)$ then so does $(g\tilde g)^{\{0,1\}^k}$, since $\mathrm{HK}^k(G_\bullet)$ is a group.
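The ordering of the vertices in the parallelepiped examples above, with $\omega_1$ varying fastest, can be reproduced mechanically. The following sketch is an illustration with hypothetical names: it takes a base point, a list of steps and a binary operation, and uses string concatenation as a stand-in for a non-commutative product so that the order of the factors $h_1^{\omega_1} \cdots h_k^{\omega_k}$ is visible.

```python
from itertools import product

def parallelepiped(x, hs, op):
    """List (x h^omega) over omega in {0,1}^k, with omega_1 varying fastest."""
    k = len(hs)
    points = []
    for rev in product((0, 1), repeat=k):
        omega = rev[::-1]              # product() varies its last slot fastest
        point = x
        for h, bit in zip(hs, omega):
            if bit:
                point = op(point, h)   # multiply on the right by h_i, i increasing
        points.append(point)
    return points

def concat(u, v):
    """Concatenation of words: a simple non-commutative product."""
    return u + v

print(parallelepiped("x", ["a", "b"], concat))
print(parallelepiped("x", ["a", "b", "c"], concat))
```

For an abelian example one may take `op` to be integer addition, in which case the output for $k = 2$ is the familiar quadruple $(x, x + h_1, x + h_2, x + h_1 + h_2)$.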

Proof of Proposition 6.5. We start by establishing the only if direction of the proposition, proving by induction on $k$ that $g^{\{0,1\}^k}$ does indeed map $H^{[k]}$ to $\mathrm{HK}^k(G_\bullet)$ when $g \in \mathrm{poly}(H, G_\bullet)$. This is clear when $k = 0$. Suppose it is known for a given value of $k \geq 0$. If $X$ is a set, we may regard $X^{\{0,1\}^{k+1}}$ as a product of two copies of $X^{\{0,1\}^k}$, the first factor corresponding to those $\omega$ with $\omega_{k+1} = 0$ and the second to those $\omega$ with $\omega_{k+1} = 1$. With this notation, every $\mathbf{z} \in H^{[k+1]}$ may be written $\mathbf{z} = (\tilde{\mathbf{z}}, \tilde{\mathbf{z}}h_{k+1})$, where $\tilde{\mathbf{z}} := (xh^\omega)_{\omega \in \{0,1\}^k}$. We may factor $g^{\{0,1\}^{k+1}}(\mathbf{z})$ as a product of two elements, namely

\[ g^{\{0,1\}^{k+1}}(\mathbf{z}) = \big(\mathrm{id}^{\{0,1\}^k}, (\partial_{h_{k+1}}g)^{\{0,1\}^k}(\tilde{\mathbf{z}})\big) \cdot \big(g^{\{0,1\}^k}(\tilde{\mathbf{z}}), g^{\{0,1\}^k}(\tilde{\mathbf{z}})\big). \tag{6.2} \]

By the inductive hypothesis we have $g^{\{0,1\}^k}(\tilde{\mathbf{z}}) \in \mathrm{HK}^k(G_\bullet)$. The derivative $\partial_{h_{k+1}}g : H \to G$ is a polynomial map with coefficients in the prefiltration $G_\bullet^{+1}$ defined by $G_i^{+1} := G_{i+1}$ (note that this is a prefiltration, since

\[ [G_i^{+1}, G_j^{+1}] = [G_{i+1}, G_{j+1}] \subseteq G_{i+j+2} \subseteq G_{i+j+1} = G_{i+j}^{+1}). \]

By a second application of the inductive hypothesis we therefore have $(\partial_{h_{k+1}}g)^{\{0,1\}^k}(\tilde{\mathbf{z}}) \in \mathrm{HK}^k(G_\bullet^{+1})$. In view of (6.2) it therefore suffices to show the inclusions

\[ \mathrm{HK}^k(G_\bullet)^\Delta \subseteq \mathrm{HK}^{k+1}(G_\bullet) \]

(where $\mathrm{HK}^k(G_\bullet)^\Delta$ is the diagonal subgroup $\{(t, t) : t \in \mathrm{HK}^k(G_\bullet)\}$) and

\[ \mathrm{id}^{\{0,1\}^k} \times \mathrm{HK}^k(G_\bullet^{+1}) \subseteq \mathrm{HK}^{k+1}(G_\bullet). \]

To check the first inclusion it suffices to check elements $(\gamma^F, \gamma^F)$ where $\gamma \in G_{\mathrm{codim}(F)}$. But it is easy to see that $(\gamma^F, \gamma^F) = \gamma^{\tilde F}$ inside $G^{\{0,1\}^{k+1}}$, where the codimension of the face $\tilde F$ inside $\{0,1\}^{k+1}$ equals $\mathrm{codim}(F)$, and the inclusion follows. To check the second inclusion it suffices to check elements $(\mathrm{id}^{\{0,1\}^k}, \gamma^F)$ where $\gamma \in G_{\mathrm{codim}(F)}^{+1} = G_{\mathrm{codim}(F)+1}$. But it is again easy to see that $(\mathrm{id}^{\{0,1\}^k}, \gamma^F) = \gamma^{\tilde F}$, where now the codimension of $\tilde F$ inside $\{0,1\}^{k+1}$ is $\mathrm{codim}(F) + 1$. This concludes the proof of the only if part of Proposition 6.5; the perceptive reader will have noticed that we have not yet made any essential use of the main property of prefiltrations, namely the nesting property that $[G_i, G_j] \subseteq G_{i+j}$.

We turn now to the proof of the if direction of the proposition. We are to show that if $g^{\{0,1\}^k}$ maps $H^{[k]}$ to $\mathrm{HK}^k(G_\bullet)$ for all $k$, then $g \in \mathrm{poly}(H, G_\bullet)$. Pick an element $\mathbf{z} = (xh^\omega)_{\omega \in \{0,1\}^k}$ in $H^{[k]}$. By Lemma 6.4 (which does use the nesting property of $G_\bullet$) we may write

\[ g^{\{0,1\}^k}(\mathbf{z}) = \gamma_0^{F_0}\gamma_1^{F_1} \cdots \gamma_{2^k-1}^{F_{2^k-1}} = \gamma_0^{F_0}\,\eta_1 \cdots \eta_k, \tag{6.3} \]

where $\gamma_j \in G_{\mathrm{codim}(F_j)}$ and $\eta_i := \prod_{2^{i-1} \leq j < 2^i} \gamma_j^{F_j}$. One may check that the $\eta_i$ enjoy the following support properties: $(\eta_i)_\omega = \mathrm{id}_G$ unless $\omega_{i+1}, \ldots, \omega_k$ are all nonzero, and $(\eta_i)_\omega = (\eta_i)_{\omega'}$ if $\omega, \omega'$ differ only in the $\omega_i$ coordinate. One may now examine (6.3) coordinatewise, peeling off $\eta_k, \eta_{k-1}, \ldots$ in turn, to eventually conclude that

\[ \gamma_0 = \partial_{h_1} \cdots \partial_{h_k} g(x). \]

Now we know that $\gamma_0 \in G_{\mathrm{codim}(F_0)} = G_k$, and thus we have proved that $\partial_{h_1} \cdots \partial_{h_k} g$ takes values in $G_k$, as required. □

Polynomial maps in coordinates. From now on we specialise to the case of polynomial maps from $\mathbb{Z}^t$ to $G$ and revert to dealing with filtrations as opposed to prefiltrations. Our aim in this section is to describe the elements of $\mathrm{poly}(\mathbb{Z}^t, G_\bullet)$ using the Mal'cev coordinate map $\psi : G \to \mathbb{R}^m$ relative to some Mal'cev basis $\mathcal{X}$ for $G/\Gamma$ adapted to the filtration $G_\bullet$.

Definition 6.6 (Multi-binomial coefficients). Let $t \geq 1$ be an integer. Suppose that $\vec n = (n_1, \ldots, n_t)$ and that $\vec j = (j_1, \ldots, j_t) \in \mathbb{Z}_{\geq 0}^t$ is a set of indices. Then we write

\[ \binom{\vec n}{\vec j} := \binom{n_1}{j_1} \cdots \binom{n_t}{j_t}. \]

A version of the following lemma may be found in [2^| §4].

Lemma 6.7 (Description of $\mathrm{poly}(\mathbb{Z}^t, G_\bullet)$ in bases). Suppose that $G/\Gamma$ is a nilmanifold of dimension $m$ and that $\mathcal{X}$ is a Mal'cev basis for $G/\Gamma$ adapted to some filtration $G_\bullet$. Then $g \in \mathrm{poly}(\mathbb{Z}^t, G_\bullet)$ if and only if the coordinates $\psi(g(\vec n))$ have the form

\[ \psi(g(\vec n)) = \sum_{\vec j} \vec t_{\vec j} \binom{\vec n}{\vec j}, \]

where each $\vec t_{\vec j}$ lies in $\mathbb{R}^m$ and is such that $(\vec t_{\vec j})_i = 0$ if $i \leq m - m_{|\vec j|}$, where $|\vec j| := j_1 + \cdots + j_t$.

Remark. The presence of the discrete subgroup $\Gamma$ is not at all relevant to this lemma; however we have only defined Mal'cev bases in this context.

Proof. We start with the if direction. If $g(\vec n)$ has the form stated then it is a product of sequences of the form $\vec n \mapsto a^{\binom{\vec n}{\vec j}}$, where $a \in G_{|\vec j|}$. By the group property of $\mathrm{poly}(\mathbb{Z}^t, G_\bullet)$ it therefore suffices to establish the result in the case that $g(\vec n)$ is actually equal to such a sequence. By induction one sees that the derivative $\partial_{h_1} \cdots \partial_{h_k} g(\vec n)$ equals $a^{p(h_1, \ldots, h_k, \vec n)}$, where the maximal degree $\alpha_1 + \cdots + \alpha_t$ of a monomial $n_1^{\alpha_1} \cdots n_t^{\alpha_t}$ appearing in $p$ is at most $\max(|\vec j| - k, 0)$. Thus we see that this derivative lies in $G_{|\vec j|}$ if $k \leq |\vec j|$ and is trivial otherwise. It follows that $g \in \mathrm{poly}(\mathbb{Z}^t, G_\bullet)$.

To prove the only if direction, let $\mathfrak{h}_j \subseteq \mathfrak{g}$ be the subspace

\[ \mathfrak{h}_j := \mathrm{Span}(X_{j+1}, \ldots, X_m) \]

and set $H_j := \exp(\mathfrak{h}_j)$. By the nesting property of the Mal'cev basis $\mathcal{X}$ (see (A.1)) we see that $H_j \lhd G$.

Suppose as a hypothesis for downward induction on $k$ that the statement has been proved for all $g \in \mathrm{poly}(\mathbb{Z}^t, G_\bullet)$ with $g(\vec n) \in H_k$ for all $\vec n$, for a certain value of $k$. This is trivial for $k = m$, in which case $g(\vec n) = \mathrm{id}_G$. Suppose that $g(\vec n) \in H_{k-1}$ for all $\vec n$. Let $\pi : H_{k-1} \to H_{k-1}/H_k \cong \mathbb{R}$ be the natural projection. Then $p_{k-1}(\vec n) := \pi(g(\vec n))$ is a polynomial map from $\mathbb{Z}^t$ to $\mathbb{R}$. Suppose that $k - 1 < m - m_i$, and that $i$ is minimal subject to this property. Then for any $h_1, \ldots, h_i \in \mathbb{Z}^t$ we have $\partial_{h_1} \cdots \partial_{h_i} g \in G_i = H_{m - m_i}$, and therefore $\partial_{h_1} \cdots \partial_{h_i} p_{k-1}(\vec n) = 0$. Thus the total degree of any monomial in $p_{k-1}$ is at most $i - 1$. Therefore we may write the sequence $h(\vec n)$ defined by

\[ h(\vec n) := \exp(X_{k-1})^{p_{k-1}(\vec n)} \]

as a product of sequences of the form $\exp(X_{k-1})^{c\binom{\vec n}{\vec j}}$ with $c \in \mathbb{R}$ and $|\vec j| \leq i - 1$. By the minimality of $i$ we have $X_{k-1} \in \mathfrak{g}_{i-1}$, and so each of these sequences lies in $\mathrm{poly}(\mathbb{Z}^t, G_\bullet)$, and hence so does $h$. It follows that the sequence $\tilde g(\vec n) := g(\vec n)h(\vec n)^{-1}$ lies in $\mathrm{poly}(\mathbb{Z}^t, G_\bullet)$. But this new sequence $\tilde g$ has $\tilde g(\vec n) \in H_k$, and hence we may proceed by induction. □
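As an illustration of Lemma 6.7 in the simplest non-abelian case, the following sketch expands the Mal'cev coordinates of the linear Heisenberg sequence $g(n) = a^n$ in the binomial basis. It assumes the triple encoding $(x, y, z)$ for the matrix with rows $(1\ x\ y)$, $(0\ 1\ z)$, $(0\ 0\ 1)$ and the coordinate map $\psi(x, y, z) = (x, z, y - xz)$; these conventions, and the helper `multi_binomial`, are assumptions for the purposes of the example. With $m = 3$ and $m_2 = 1$, the lemma predicts that the coefficient of $\binom{n}{2}$ vanishes in its first two entries.

```python
from fractions import Fraction
from math import comb

def multi_binomial(n, j):
    """Product of binom(n_i, j_i) over the coordinates (nonnegative n_i only)."""
    result = 1
    for ni, ji in zip(n, j):
        result *= comb(ni, ji)
    return result

def mul(p, q):
    return (p[0] + q[0], p[1] + q[1] + p[0] * q[2], p[2] + q[2])

def power(a, n):
    result = (Fraction(0), Fraction(0), Fraction(0))
    for _ in range(n):
        result = mul(result, a)
    return result

def psi(p):
    """Assumed Mal'cev coordinates: psi(x, y, z) = (x, z, y - x*z)."""
    x, y, z = p
    return (x, z, y - x * z)

a = (Fraction(1, 3), Fraction(1, 5), Fraction(2, 7))
x, y, z = a
t1 = psi(a)              # coefficient of binom(n, 1)
t2 = (0, 0, -x * z)      # coefficient of binom(n, 2): supported on the G_2 slot

for n in range(12):
    predicted = tuple(
        multi_binomial((n,), (1,)) * u + multi_binomial((n,), (2,)) * v
        for u, v in zip(t1, t2)
    )
    assert psi(power(a, n)) == predicted
```

The degree 2 term arises solely from the change of coordinates $y \mapsto y - xz$, in line with the statement that $g$ has degree 2 even though it is a one-parameter subgroup.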

A useful and easily-derived corollary of Lemma 6.7 is that $\mathrm{poly}(\mathbb{Z}^t, G_\bullet)$ is closed under dilations.

Corollary 6.8 (Dilation of polynomial sequences). Suppose that $g \in \mathrm{poly}(\mathbb{Z}^t, G_\bullet)$ and that $a_1, \ldots, a_t, b_1, \ldots, b_t \in \mathbb{Z}$. Then the sequence $\vec n \mapsto g(a_1 + b_1n_1, \ldots, a_t + b_tn_t)$ also lies in $\mathrm{poly}(\mathbb{Z}^t, G_\bullet)$. □

We remarked in the introduction that a sequence $g : \mathbb{Z} \to G$ is polynomial with coefficients in some filtration $G_\bullet$ if and only if $g$ has the form

\[ g(n) = a_1^{p_1(n)} \cdots a_k^{p_k(n)} \tag{6.4} \]

for polynomials $p_1, \ldots, p_k$ with integer coefficients. Although this result is not required in the paper it is certainly conceivable that one might wish to apply the main theorems of the paper to a sequence which is presented in an explicit form such as (6.4), and does not obviously satisfy the more abstract condition of Definition 1.8.

The fact that every polynomial sequence has the form (6.4) is an easy consequence of Lemma 6.7. To establish the converse, consider first the lower central series filtration $G_\bullet$ which has degree $s$, the step of the nilpotent Lie group $G$. Let $d$ be the maximum degree occurring amongst the polynomials $p_i$ and define a finer filtration $G'_\bullet$ of degree $sd$ by setting $G'_i := G_{\lceil i/d \rceil}$. This is a filtration since

\[ [G'_i, G'_j] = [G_{\lceil i/d \rceil}, G_{\lceil j/d \rceil}] \subseteq G_{\lceil i/d \rceil + \lceil j/d \rceil} \subseteq G_{\lceil (i+j)/d \rceil} = G'_{i+j}. \]

Any sequence of the form $n \mapsto a_i^{n^j}$, $j \leq d$, has coefficients in $G'_\bullet$ since $G'_i = G$ for $i = 0, 1, \ldots, d$ and the $(d+1)$st derivative of such a sequence is trivial. Since $g$ is a product of such sequences and $\mathrm{poly}(\mathbb{Z}, G'_\bullet)$ is a group we see that $g \in \mathrm{poly}(\mathbb{Z}, G'_\bullet)$.

We note that if $G/\Gamma$ has a $Q$-rational Mal'cev basis adapted to the lower central series then, by the results of the appendix, there is a $Q^{O(1)}$-rational Mal'cev basis for $G/\Gamma$ adapted to $G'_\bullet$.

We leave it to the reader to formulate and prove an analogous result for polynomial mappings from $\mathbb{Z}^t$ to $G$.
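The only arithmetic input to the verification that $G'_i := G_{\lceil i/d \rceil}$ is a filtration is the superadditivity $\lceil i/d \rceil + \lceil j/d \rceil \geq \lceil (i+j)/d \rceil$, together with the fact that $\lceil i/d \rceil \leq 1$ for $0 \leq i \leq d$. The following short sketch (with hypothetical helper names) confirms both over a range of small values.

```python
def ceil_div(i, d):
    """Ceiling of i/d for integers i >= 0 and d >= 1, in integer arithmetic."""
    return -(-i // d)

def refined_index(i, d):
    """Index of the lower central series group used for G'_i = G_{ceil(i/d)}."""
    return ceil_div(i, d)

for d in range(1, 6):
    for i in range(0, 30):
        # G'_i = G for i = 0, 1, ..., d, since G_0 = G_1 = G.
        if i <= d:
            assert refined_index(i, d) <= 1
        for j in range(0, 30):
            # Needed for [G'_i, G'_j] to be contained in G'_{i+j}.
            assert refined_index(i, d) + refined_index(j, d) >= refined_index(i + j, d)
```

Since $\lceil i/d \rceil > s$ exactly when $i > sd$, the refined filtration has degree $sd$ as stated.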



7. The general case of the main theorem

We are now in a position to attack the general case of Theorem 2.9. Our analysis of the Heisenberg example in §5 suggested that the argument will involve an induction on the degree $d$ of $G_\bullet$. In that case there were two different scenarios in which we reduced from the case $d = 2$ to the case $d = 1$. Whilst the same is true in general, the introduction of genuinely polynomial sequences (rather than just linear ones) necessitates a further inductive loop on the quantity $m_* := m_{\mathrm{ab}} - m_{\mathrm{lin}}$, which we call the nonlinearity degree. To see why, consider the following slightly informal example.

Example. Let $G/\Gamma$ be the Heisenberg example, and let $g(n) = \begin{pmatrix} 1 & \alpha_1 & \alpha_3 \\ 0 & 1 & \alpha_2 \\ 0 & 0 & 1 \end{pmatrix}^n$, where $\alpha_1, \alpha_2$ and $\alpha_3$ are highly independent over $\mathbb{Q}$. Then there is no horizontal character $\eta$ of low frequency such that $\|\eta \circ g\|_{C^\infty[N]}$ is small.

Now we have $\partial_h g = g(h)$ and $\partial_{h_1} \cdots \partial_{h_i} g = \mathrm{id}_G$ for $i \geq 2$, and so $g$ has coefficients in the subgroup sequence $G_\bullet$ defined by $G_{(0)} := G_{(1)} := G_{(2)} := G$, $G_{(3)} := G_{(4)} := G_2$, and $G_{(i)} := \{\mathrm{id}_G\}$ for $i \geq 5$. With this choice we have $G^\square = G \times G$. However $g_h^\square$ takes values in $G \times_{G_2} G$, and hence $\eta^\square \circ g_h^\square = 0$ for any horizontal character $\eta^\square$ with frequency of the form $(a, b, -a, -b) \in \mathbb{Z}^4$. Thus, a lack of uniform distribution for $g_h^\square$ does not imply lack of uniform distribution for $g$.

The problem in the above example is that the filtration $G_\bullet$ was far too "coarse" to accurately capture the differential structure of the sequence $g$. Indeed $g$ also takes values in the minimal (lower central series) filtration, as we saw in §5.

In the light of the above example we can expect that it will sometimes be necessary to pass to a "finer" filtration of the same degree $d$, in order to properly capture the differential structure of $g$. This finer filtration will have a smaller value of the nonlinearity degree $m_*$, and thus we introduce an extra inductive loop to incorporate this parameter. To be precise we shall prove, by induction on $d$ and $m_*$, the following slight variant of Theorem 2.9.

Theorem 7.1 (Variant of Main Theorem). Let $m, m_*, d \geq 0$ be integers with $m_* \leq m$. Let $0 < \delta < 1/2$ and suppose that $N \geq 1$. Suppose that $G/\Gamma$ is a nilmanifold and that $G_\bullet$ is a filtration of degree $d$ and with nonlinearity degree $m_*$. Suppose that $\mathcal{X}$ is a $\frac{1}{\delta}$-rational Mal'cev basis adapted to $G_\bullet$ and suppose that $g \in \mathrm{poly}(\mathbb{Z}, G_\bullet)$. If $(g(n)\Gamma)_{n \in [N]}$ is not $\delta$-equidistributed then there is a horizontal character $\eta$ with $0 < |\eta| \ll \delta^{-O_{m,m_*,d}(1)}$ such that

\[ \|\eta \circ g\|_{C^\infty[N]} \ll \delta^{-O_{m,m_*,d}(1)}. \]

It is clear that this does imply Theorem 2.9, since the dependence of the $O(1)$ exponents on $m_*$ may be suppressed once Theorem 7.1 has been proven by induction. In our proof there will be an outer inductive loop over $d$ and an inner one over $m_*$. In other words we shall assume that Theorem 7.1 holds for all pairs $(d', m'_*)$ in which either $d' < d$ or for which $d' = d$ and $m'_* < m_*$, and deduce the case $(d, m_*)$.

Henceforth we allow all constants implicit in the $\ll$ or $O$-notation to depend on $d$, $m$ and $m_*$.

We begin with some simple reductions. By Lemma 3.7 we may assume that the orbit $(g(n)\Gamma)_{n \in [N]}$ is not $\delta^{O(1)}$-equidistributed along some vertical frequency $\xi \in \mathbb{Z}^{m_d}$ with $|\xi| \ll \delta^{-O(1)}$. Thus there is some function $F : G/\Gamma \to \mathbb{C}$ with $\|F\|_{\mathrm{Lip}} \leq 1$ and vertical frequency $\xi$ such that

\[ \Big| \mathbb{E}_{n \in [N]} F(g(n)\Gamma) - \int_{G/\Gamma} F \Big| \gg \delta^{O(1)}. \tag{7.1} \]

If $\xi = 0$ then $F$ is $G_d$-invariant and we may descend to $G/G_d$, together with the filtration $G_\bullet/G_d$ which has length $d - 1$, and invoke our inductive hypothesis. We pause to give the rather straightforward details.

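Write $\overline{G} := G/G_d$ and $\overline{\Gamma} := \Gamma/(\Gamma \cap G_d)$. Then $\overline{G}/\overline{\Gamma}$ is a nilmanifold together with a filtration $\overline{G}_\bullet$ of length $d - 1$, where $\overline{G}_i := G_i/G_d$. The Mal'cev basis $\mathcal{X} = \{X_1, \ldots, X_m\}$ may be reduced to give a $\frac{1}{\delta}$-rational Mal'cev basis $\overline{\mathcal{X}} = \{\overline{X}_1, \ldots, \overline{X}_{\overline{m}}\}$ for $\overline{G}/\overline{\Gamma}$ adapted to $\overline{G}_\bullet$, where $\overline{m} := m - m_d$.

Write $\overline{g} : \mathbb{Z} \to \overline{G}$ for the reduction of $g \pmod{G_d}$. By the $G_d$-invariance the function $F$ descends to a Lipschitz function $\overline{F} : \overline{G}/\overline{\Gamma} \to \mathbb{C}$ with $\|\overline{F}\|_{\mathrm{Lip}} \leq \|F\|_{\mathrm{Lip}}$, and so (7.1) implies that

\[ \Big| \mathbb{E}_{n \in [N]} \overline{F}(\overline{g}(n)\overline{\Gamma}) - \int_{\overline{G}/\overline{\Gamma}} \overline{F} \Big| \gg \delta^{O(1)} \|\overline{F}\|_{\mathrm{Lip}}. \]

(Here we have used the fact that normalised Haar measure on $\overline{G}/\overline{\Gamma}$ is obtained by quotienting that on $G/\Gamma$ by $G_d$.)

We may now apply the inductive hypothesis to obtain a horizontal character $\overline{\eta} : \overline{G} \to \mathbb{R}/\mathbb{Z}$ on $\overline{G}$ of frequency magnitude $0 < |\overline{\eta}| \ll \delta^{-O(1)}$ such that

\[ \|\overline{\eta} \circ \overline{g}\|_{C^\infty[N]} \ll \delta^{-O(1)}. \]

If we let $\eta : G \to \mathbb{R}/\mathbb{Z}$ be the horizontal character on $G$ defined by $\eta(x) = \overline{\eta}(\overline{x})$ then we have $\overline{\eta} \circ \overline{g} = \eta \circ g$ and $|\overline{\eta}| = |\eta|$. This concludes the proof in the case $\xi = 0$.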

Suppose henceforth that $\xi \neq 0$. Since $F$ has $\xi$ as a vertical frequency, (7.1) becomes

\[ |\mathbb{E}_{n \in [N]} F(g(n)\Gamma)| \gg \delta^{O(1)}. \tag{7.2} \]

We proceed initially with two additional reductions. The first is to the case $g(0) = \mathrm{id}_G$. Factorize $g(0) = \{g(0)\}[g(0)]$ as in Lemma A.14. Set $\tilde g(n) := \{g(0)\}^{-1}g(n)g(0)^{-1}\{g(0)\}$. Then we have $|\mathbb{E}_{n \in [N]} \tilde F(\tilde g(n)\Gamma)| \gg \delta^{O(1)}$, where $\tilde F(x) := F(\{g(0)\}x)$. But $\tilde F$ still has vertical oscillation $\xi$ and, by Lemma A.5, it has Lipschitz constant $O(1)$. Noting that $\|\eta \circ \tilde g\|_{C^\infty[N]} = \|\eta \circ g\|_{C^\infty[N]}$ we see that if we have Theorem 7.1 for $\tilde g$ then we also have it for $g$.

The second reduction is to the case when $|\psi(g(1))| \leq 1$ (this is needed in the lead up to (7.16)). To do this, factorize $g(1) = \{g(1)\}[g(1)]$ as in Lemma A.14. Set $\tilde g(n) := g(n)[g(1)]^{-n}$. Then $\tilde g(n)\Gamma = g(n)\Gamma$, $\tilde g(0) = \mathrm{id}_G$, $\tilde g \in \mathrm{poly}(\mathbb{Z}, G_\bullet)$ and $\pi(\tilde g(n)\Gamma) = \pi(g(n)\Gamma)$, so proving Theorem 7.1 for $\tilde g$ is equivalent to proving it for $g$.

Henceforth we assume $g(0) = \mathrm{id}_G$ and $|\psi(g(1))| \leq 1$.

As in §5 we apply van der Corput's Lemma (Corollary 4.2) to (7.2) to deduce that for $\gg \delta^{O(1)}N$ values of $h$, we have

\[ |\mathbb{E}_{n \in [N]} F(g(n+h)\Gamma)\overline{F(g(n)\Gamma)}| \gg \delta^{O(1)}. \tag{7.3} \]

For each fixed $h$ this may be interpreted as a statement about the polynomial sequence $(g(n+h), g(n))$ on the product group $G^2$. However, guided by our experience with the Heisenberg group, it is natural to try and interpret it as a sequence on a somewhat smaller group. To this end, we define the nonlinear part $g_2$ of $g$ by

\[ g_2(n) := g(n)g(1)^{-n}. \tag{7.4} \]

Motivated by what we did in §5 we may then rewrite (7.3) in the form

\[ |\mathbb{E}_{n \in [N]} \tilde F_h(\tilde g_h(n)\Gamma^2)| \gg \delta^{O(1)}, \tag{7.5} \]

where

\[ \tilde F_h(x, y) := F(\{g(1)^h\}x)\overline{F(y)} \]

and

\[ \tilde g_h(n) := \big(\{g(1)^h\}^{-1}g_2(n+h)g(1)^n\{g(1)^h\},\ g_2(n)g(1)^n\big). \tag{7.6} \]

It turns out that $\tilde g_h$ takes values in $G^\square := G \times_{G_2} G$, just as we found in our analysis of the Heisenberg case. To prove this note that we have $G_2 \supseteq [G, G]$, and so $G$ becomes abelian after quotienting out by the normal subgroup $G_2$. Thus we need only prove that $g_2(n) \in G_2$ for all $n$. We have $\partial^2 g(n) = \mathrm{id}_G$ modulo $G_2$. Since $g(0) = \mathrm{id}_G$, this implies by an easy induction that $g(n) = g(1)^n$ modulo $G_2$, and so $g_2$ does indeed take values in $G_2$.
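The claim that the nonlinear part $g_2(n) = g(n)g(1)^{-n}$ takes values in $G_2$ can be observed directly in the integer Heisenberg group with its lower central series filtration, where $G_2$ is the centre. The sketch below uses the degree 2 sequence $g(n) = a^n b^n$ (which satisfies $g(0) = \mathrm{id}_G$) and the triple encoding $(x, y, z)$ for the matrix with rows $(1\ x\ y)$, $(0\ 1\ z)$, $(0\ 0\ 1)$; the encoding and helper names are assumptions made for illustration.

```python
def mul(p, q):
    """Matrix product in the assumed (x, y, z) encoding."""
    return (p[0] + q[0], p[1] + q[1] + p[0] * q[2], p[2] + q[2])

def inv(p):
    x, y, z = p
    return (-x, -y + x * z, -z)

def power(a, n):
    """a^n for any integer n (negative powers use the inverse)."""
    base = a if n >= 0 else inv(a)
    result = (0, 0, 0)
    for _ in range(abs(n)):
        result = mul(result, base)
    return result

a, b = (1, 0, 0), (0, 0, 1)

def g(n):
    """A polynomial sequence of degree 2 with g(0) equal to the identity."""
    return mul(power(a, n), power(b, n))

def g2(n):
    """Nonlinear part g_2(n) = g(n) g(1)^{-n}."""
    return mul(g(n), power(g(1), -n))

assert g(0) == (0, 0, 0)
for n in range(-5, 6):
    x, _, z = g2(n)
    # Central elements are exactly those with x = z = 0.
    assert (x, z) == (0, 0)
```

In this example $g_2(n)$ has top-right entry $\binom{n}{2}$, so it is genuinely nonlinear although it vanishes at $n = 0$ and $n = 1$.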

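We may therefore replace (7.5) by

\[ |\mathbb{E}_{n \in [N]} F_h^\square(g_h^\square(n)\Gamma^\square)| \gg \delta^{O(1)} \tag{7.7} \]

by restricting everything in that equation to an object on $G^\square$.

Note that, exactly as in the Heisenberg case, $F_h^\square$ is invariant under $G_d^\Delta = \{(g_d, g_d) : g_d \in G_d\}$. Indeed, since $G_d$ is central in $G$, we have

\[ F_h^\square((g_d, g_d) \cdot (x, y)) = F(\{g(1)^h\}g_d x)\overline{F(g_d y)} = e(\xi(g_d))\overline{e(\xi(g_d))}F(\{g(1)^h\}x)\overline{F(y)} = F_h^\square((x, y)). \]

Thus $F_h^\square$ descends to a function $\overline{F_h^\square}$ on $\overline{G^\square} := G^\square/G_d^\Delta$ and we may write (7.7) as

\[ |\mathbb{E}_{n \in [N]} \overline{F_h^\square}(\overline{g_h^\square}(n)\overline{\Gamma^\square})| \gg \delta^{O(1)}, \tag{7.8} \]

where $\overline{\Gamma^\square} := \Gamma^\square/(\Gamma^\square \cap G_d^\Delta)$.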

The next proposition is central to our whole argument in that it clarifies the sense in which $G^\square$ is "less complex" than $G$.

Proposition 7.2 (Reduction in degree). Define $(G^\square)_i := G_i \times_{G_{i+1}} G_i$ for $i = 1, \ldots, d$. Then $(G^\square)_\bullet$ is a filtration on $G^\square$ of degree $d$. Since $(G^\square)_d = G_d^\Delta$, it descends under quotienting by $G_d^\Delta$ to a filtration $(\overline{G^\square})_\bullet$ of degree $d - 1$ on $\overline{G^\square}$. Each polynomial sequence $g_h^\square$ lies in $\mathrm{poly}(\mathbb{Z}, (G^\square)_\bullet)$, and hence each reduced polynomial sequence $\overline{g_h^\square}$ lies in $\mathrm{poly}(\mathbb{Z}, (\overline{G^\square})_\bullet)$.

Proof. We start with a lemma.

Lemma 7.3. Suppose that $H_1, H_2$ and $K_1, K_2$ are normal subgroups of a group $G$,
that $H_1, H_2$ generate a group $H$ and that $K_1, K_2$ generate a group $K$. Then $[H,K]$ is
generated by the groups $[H_i, K_j]$, $1 \leq i, j \leq 2$.

Proof. The groups $[H_i, K_j]$ are all normal, and thus the group they generate is also
normal. If we quotient by that group, then $H_1, H_2$ commute with $K_1, K_2$, and thus $H$
commutes with $K$. The claim follows. □

Now observe that $(G^\square)_i$ is generated by $G_{i+1}^2$ and $G_i^\Delta$. In view of the lemma it
therefore suffices to establish that all four of the quantities

$$[G_i^\Delta, G_j^\Delta],\quad [G_i^\Delta, G_{j+1}^2],\quad [G_{i+1}^2, G_j^\Delta],\quad [G_{i+1}^2, G_{j+1}^2]$$

lie in $(G^\square)_{i+j}$. Using the fact that $G_\bullet$ is a filtration, the first quantity is manifestly contained
in $G_{i+j}^\Delta$ and the last three lie in $G_{i+j+1}^2$. It follows immediately that $(G^\square)_\bullet$ is indeed a
filtration.

Next we show that $g_h^\square \in \mathrm{poly}(\mathbb{Z}, (G^\square)_\bullet)$. Here we make serious use of the fact that
$\mathrm{poly}(\mathbb{Z}, G_\bullet)$ is a group for the first time. Recall that

$$g_h^\square(n) := (\{g(1)^h\}^{-1} g_2(n+h) g(1)^n \{g(1)^h\},\ g_2(n) g(1)^n). \tag{7.9}$$

Now $\mathrm{poly}(\mathbb{Z}, (G^\square)_\bullet)$ is a group, and it is also closed under conjugation by elements of
$G^2$. Since $(g(1)^n, g(1)^n)$ is obviously in $\mathrm{poly}(\mathbb{Z}, (G^\square)_\bullet)$, it suffices to check that
$(g_2(n+h), g_2(n)) \in \mathrm{poly}(\mathbb{Z}, (G^\square)_\bullet)$. Of course, $g_2 \in \mathrm{poly}(\mathbb{Z}, G_\bullet)$ and hence, by Lemma 6.7, it is
a product of elements $g_i^{\binom{n}{i}}$ with $g_i \in G_i$. It therefore suffices to show that
$(g_i^{\binom{n+h}{i}}, g_i^{\binom{n}{i}}) \in \mathrm{poly}(\mathbb{Z}, (G^\square)_\bullet)$. Taking $j$th derivatives, it suffices to check that
$g_i^{\binom{n+h}{i-j}} = g_i^{\binom{n}{i-j}} \pmod{G_{j+1}}$. For
$j < i$ this follows from the fact that $g_i \in G_i$, whilst for $j \geq i$ it is trivial. □

In order to apply the inductive hypothesis, we must specify a Mal'cev basis $\overline{\mathcal{X}^\square}$ for
$\overline{G^\square}/\overline{\Gamma^\square}$ adapted to the sequence $(\overline{G^\square})_\bullet$, and it must then be checked that $\overline{F_h^\square}$ is Lipschitz
with respect to the metric $d_{\overline{\mathcal{X}^\square}}$. These are rather tedious matters and we recommend
that the reader take the following lemma on trust on a first reading of the paper.

Lemma 7.4 (Rationality bounds for the relative square). There is an $O(\delta^{-O(1)})$-rational
Mal'cev basis $\mathcal{X}^\square = \{X_1^\square, \dots, X_{m^\square}^\square\}$ for $G^\square/\Gamma^\square$ adapted to the filtration $(G^\square)_\bullet$ with
the property that $\psi_{\mathcal{X}^\square}(x,x')$ is a polynomial of degree $O(1)$ with rational coefficients of
height $\delta^{-O(1)}$ in the coordinates $\psi(x), \psi(x')$. With respect to the metric $d_{\mathcal{X}^\square}$ we have
$\|F_h^\square\|_{\mathrm{Lip}} \leq \delta^{-O(1)}$ uniformly in $h$.

Proof. We consider $G^\square$ as a subgroup of $G \times G$. Recall (cf. Definition A.7) the
definition of a weak basis. It is clear that $\mathcal{X} \times \mathcal{X} = \{(X_1, 0), (0, X_1), \dots, (X_m, 0), (0, X_m)\}$
is a $\delta^{-O(1)}$-rational weak basis for $G/\Gamma \times G/\Gamma$ and that each of the groups $(G^\square)_i :=
G_i \times_{G_{i+1}} G_i$ is $\delta^{-O(1)}$-rational with respect to this basis. By Proposition A.10 it follows
that there is a Mal'cev basis $\mathcal{X}^\square = \{X_1^\square, \dots, X_{m^\square}^\square\}$ for $G^\square/\Gamma^\square$, adapted to the filtration
$(G^\square)_\bullet$, with the property that each $X_i^\square$ is a $\delta^{-O(1)}$-rational combination of the elements
of $\mathcal{X} \times \mathcal{X}$. By adding the elements $(X_1, 0), \dots, (X_{m_{\mathrm{lin}}}, 0)$ to $\mathcal{X}^\square$ we obtain a weak basis
$\mathcal{Y}$ for $G/\Gamma \times G/\Gamma$ which enjoys the nesting property (A.1). From Lemma A.2 it follows
that each coordinate of $\psi_{\mathcal{Y}}(x,x')$ is a polynomial of degree $O(1)$ and with coefficients






of height $\delta^{-O(1)}$ in the coordinates $\psi_{\mathcal{X} \times \mathcal{X}}(x,x')$. Restricting to those pairs $(x,x')$ which lie in $G^\square$,
we obtain the stated property.

Recall that $F_h^\square(x^\square) = F(\{g(1)^h\}x)\overline{F(x')}$. Now by definition we have $|\psi_{\mathcal{X}}(\{g(1)^h\})| \leq
1$. By Lemma A.5 (and Lemma A.14, which guarantees that every $x \in G/\Gamma$ has a representative
with coordinates bounded by $O(1)$) we see that $(x,x') \mapsto F(\{g(1)^h\}x)\overline{F(x')}$
defines a function on $G \times G$ whose Lipschitz constant with respect to the product metric
$d \times d$ is $\ll \delta^{-O(1)}$. Now by Lemma A.6 and the construction of $\mathcal{X}^\square$ we therefore have
$\|F_h^\square\|_{\mathrm{Lip}} \leq \delta^{-O(1)}$ where, remember, the Lipschitz constant is being computed with
respect to the metric $d_{\mathcal{X}^\square}$. □

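Let us now resume the discussion starting from (7.8). We begin by reprising some of
the straightforward arguments at the start of the section (where we dealt with the case
$\xi = 0$). By reducing the first $\overline{m^\square} := m^\square - m_d$ elements of $\mathcal{X}^\square$ we obtain an $O(\delta^{-O(1)})$-rational
Mal'cev basis $\overline{\mathcal{X}^\square} = \{\overline{X_1^\square}, \dots, \overline{X^\square_{\overline{m^\square}}}\}$ for $\overline{G^\square}/\overline{\Gamma^\square}$ adapted to the filtration $(\overline{G^\square})_\bullet$.
With respect to the metric $d_{\overline{\mathcal{X}^\square}}$ we have $\|\overline{F_h^\square}\|_{\mathrm{Lip}} \leq \delta^{-O(1)}$.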



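Since $(\overline{G^\square})_\bullet$ has degree $d - 1$ our inductive hypothesis is applicable and we conclude
that for $\gg \delta^{O(1)} N$ values of $h \in [N]$ there is some horizontal character $\overline{\eta}_h : \overline{G^\square} \to \mathbb{R}/\mathbb{Z}$
with $0 < |\overline{\eta}_h| \ll \delta^{-O_m(1)}$ and

$$\|\overline{\eta}_h \circ \overline{g_h^\square}\|_{C^\infty[N]} \ll \delta^{-O_m(1)}.$$

By pigeonholing in $h$ we may assume that $\overline{\eta} = \overline{\eta}_h$ is independent of $h$. Writing $\eta : G^\square \to
\mathbb{R}/\mathbb{Z}$ for the horizontal character defined by $\eta(x) = \overline{\eta}(\overline{x})$, we see that $0 < |\eta| \ll \delta^{-O_m(1)}$
and that

$$\|\eta \circ g_h^\square\|_{C^\infty[N]} \ll \delta^{-O_m(1)}. \tag{7.10}$$

The next lemma, which is almost identical to Lemma 5.2, allows us to write $\eta$ in
terms of maps defined on $G$ rather than $G^\square$.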

Lemma 7.5. We have a decomposition $\eta(g',g) = \eta_1(g) + \eta_2(g'g^{-1})$ for all $(g',g) \in G^\square$,
where $\eta_1 : G \to \mathbb{R}/\mathbb{Z}$ is a horizontal character on $G$, and $\eta_2 : G_2 \to \mathbb{R}/\mathbb{Z}$ is a horizontal
character on $G_2$ which also annihilates $[G, G_2]$. Furthermore we have $|\eta_1|, |\eta_2| \ll \delta^{-O(1)}$.

Proof. If we define $\eta_1(g) := \eta(g,g)$ and $\eta_2(g_2) := \eta(g_2, \mathrm{id}_G)$ for $g \in G$ and $g_2 \in G_2$ then
the decomposition follows since $\eta$ is an additive homomorphism. Since $\eta$ annihilates
$[G^\square, G^\square]$, which contains $[G^\Delta, G_2 \times \mathrm{id}_G] = [G, G_2] \times \mathrm{id}_G$, we see that $\eta_2$ annihilates
$[G, G_2]$; since $\eta$ annihilates $\Gamma^\square$, which contains both $\Gamma^\Delta$ and $(\Gamma \cap G_2) \times \mathrm{id}_G$, we see that
$\eta_1$ and $\eta_2$ annihilate $\Gamma$ and $\Gamma \cap G_2$ respectively.

It remains to check the boundedness properties. Writing

$$\eta(x,x') = k^\square \cdot \psi_{\mathcal{X}^\square}(x,x'),$$

where $k^\square \in \mathbb{Z}^{m^\square}$, we have by definition that $|k^\square| \ll \delta^{-O(1)}$. The integer vectors $k_1$ and
$k_2$ used to define $|\eta_1|$ and $|\eta_2|$ are then given by

$$k_1 \cdot \psi(x) = \eta_1(x) = \eta(x,x) = k^\square \cdot \psi_{\mathcal{X}^\square}(x,x)$$

and

$$k_2 \cdot \psi(x) = \eta_2(x) = \eta(x, \mathrm{id}_G) = k^\square \cdot \psi_{\mathcal{X}^\square}(x, \mathrm{id}_G).$$

That $|k_1|, |k_2| \ll \delta^{-O(1)}$ now follows immediately from the fact, established in Lemma
7.4, that $\psi_{\mathcal{X}^\square}(x,x')$ is a polynomial of degree $O(1)$ with rational coefficients of height
$O(\delta^{-O(1)})$ in the coordinates $\psi(x), \psi(x')$. □

Now let us return to (7.10), and reinterpret this in terms of the decomposition of $\eta$
just given. Recalling the formula (7.9) for $g_h^\square(n)$ we therefore have

$$\eta(g_h^\square(n)) = \eta_1(g(n)) + \eta_2(\{g(1)^h\}^{-1} g_2(n+h) g(1)^n \{g(1)^h\} g(1)^{-n} g_2(n)^{-1})$$

which, since $\eta_2$ vanishes on $[G, G_2]$, is equal to

$$\eta_1(g(n)) + \eta_2(g_2(n+h) \{g(1)^h\}^{-1} g(1)^n \{g(1)^h\} g(1)^{-n} g_2(n)^{-1})$$
$$= \eta_1(g(n)) + \eta_2(g_2(n+h)) - \eta_2(g_2(n)) + \eta_2(\{g(1)^h\}^{-1} g(1)^n \{g(1)^h\} g(1)^{-n}).$$

Now one easily verifies by induction on $n$ that $y^{-1} x^n y x^{-n} = [x,y]^n \pmod{[G,[G,G]]}$.
Since $\eta_2$ annihilates $[G, G_2]$, which contains $[G,[G,G]]$, we can therefore simplify the
above a little further to

$$\eta(g_h^\square(n)) = \eta_1(g(n)) + \eta_2(g_2(n+h)) - \eta_2(g_2(n)) + n\,\eta_2([g(1), \{g(1)^h\}])$$
$$:= P(n) + Q(n+h) - Q(n) + \sigma(h)n, \tag{7.11}$$

where $P, Q : \mathbb{Z} \to \mathbb{R}/\mathbb{Z}$ are polynomial sequences of degree at most $d$.
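The commutator congruence used in this step can be sanity-checked in the simplest non-abelian case. The following is only an illustrative sketch: it models the integer Heisenberg group by triples `(a, b, c)` (an encoding chosen here, not notation from the paper) and assumes the convention $[x,y] = x^{-1}y^{-1}xy$. Since this group is 2-step nilpotent, $[G,[G,G]]$ is trivial, so the congruence becomes an exact equality.

```python
# Heisenberg group elements as triples (a, b, c), standing for the integer
# upper unitriangular matrix [[1, a, c], [0, 1, b], [0, 0, 1]].
# This encoding and the helper names are assumptions made for illustration.

def mul(x, y):
    """Group product, read off from matrix multiplication."""
    (a, b, c), (a2, b2, c2) = x, y
    return (a + a2, b + b2, c + c2 + a * b2)

def inv(x):
    """Group inverse."""
    a, b, c = x
    return (-a, -b, a * b - c)

def power(x, n):
    """x^n for any integer n, by repeated multiplication."""
    result = (0, 0, 0)  # identity element
    base = x if n >= 0 else inv(x)
    for _ in range(abs(n)):
        result = mul(result, base)
    return result

def commutator(x, y):
    """Assumed convention: [x, y] = x^{-1} y^{-1} x y."""
    return mul(mul(inv(x), inv(y)), mul(x, y))

def identity_holds(x, y, n):
    """Check y^{-1} x^n y x^{-n} == [x, y]^n in this 2-step group."""
    lhs = mul(mul(inv(y), power(x, n)), mul(y, power(x, -n)))
    return lhs == power(commutator(x, y), n)
```

In a group of higher step the two sides would agree only modulo $[G,[G,G]]$, which is all that is needed above because $\eta_2$ annihilates that subgroup.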

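The next lemma is specifically designed to handle the situation that has arisen here.
In this lemma it is convenient to reprise a notation from earlier papers of ours (such as
[11]): if $\alpha \in \mathbb{R}/\mathbb{Z}$ and $Q \geq 1$ we write $\|\alpha\|_{\mathbb{R}/\mathbb{Z},Q} := \inf_{1 \leq q \leq Q} \|q\alpha\|_{\mathbb{R}/\mathbb{Z}}$. In a similar spirit,
for any $f : \mathbb{Z} \to \mathbb{R}/\mathbb{Z}$ define

$$\|f\|_{C^\infty[N],Q} := \inf_{1 \leq q \leq Q} \|qf\|_{C^\infty[N]}.$$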
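To make the two "norms with a multiplier" concrete, here is a small sketch using exact rational arithmetic. It assumes the single-parameter smoothness norm is $\sup_{j \geq 1} N^j \|\alpha_j\|_{\mathbb{R}/\mathbb{Z}}$ in terms of the binomial (Taylor) coefficients $\alpha_j$, as in the $t = 1$ case of Definition 8.2; the function names are hypothetical.

```python
from fractions import Fraction

def dist_to_nearest_integer(x):
    """||x||_{R/Z}: distance from x to the nearest integer."""
    frac = x - (x // 1)
    return min(frac, 1 - frac)

def norm_rz_q(alpha, Q):
    """||alpha||_{R/Z,Q} = min over 1 <= q <= Q of ||q * alpha||_{R/Z}."""
    return min(dist_to_nearest_integer(q * alpha) for q in range(1, Q + 1))

def smoothness_norm(coeffs, N):
    """sup over j >= 1 of N^j * ||alpha_j||_{R/Z}; coeffs[j] is alpha_j
    (the constant term coeffs[0] is ignored)."""
    values = [N ** j * dist_to_nearest_integer(a)
              for j, a in enumerate(coeffs) if j >= 1]
    return max(values, default=Fraction(0))

def smoothness_norm_q(coeffs, N, Q):
    """||f||_{C^infty[N],Q} = min over 1 <= q <= Q of ||q f||_{C^infty[N]}."""
    return min(smoothness_norm([q * a for a in coeffs], N)
               for q in range(1, Q + 1))
```

For instance, a coefficient close to $1/3$ has large $\|\cdot\|_{\mathbb{R}/\mathbb{Z}}$ but small $\|\cdot\|_{\mathbb{R}/\mathbb{Z},3}$, which is the kind of "rational up to a bounded multiplier" behaviour the lemma below tracks.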

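Lemma 7.6 (Polynomials lemma). Suppose that $P, Q : \mathbb{Z} \to \mathbb{R}/\mathbb{Z}$ are polynomial sequences
of degree at most $d$ with $P(0) = 0$ and $Q(0) = \partial Q(0) = 0$ and that $\sigma : [N] \to
\mathbb{R}/\mathbb{Z}$ is an arbitrary map. Suppose that there are $\gg \delta^{O(1)} N$ values of $h \in [N]$ such that

$$\|P(n) + Q(n+h) - Q(n) + \sigma(h)n\|_{C^\infty[N]} \ll \delta^{-O(1)}.$$

Then $\|\partial^i Q(0)\|_{\mathbb{R}/\mathbb{Z}, \delta^{-O(1)}} \ll \delta^{-O(1)}/N^i$ for $i \geq 3$, and

$$\|P(1) + \alpha h + \sigma(h)\|_{\mathbb{R}/\mathbb{Z}, \delta^{-O(1)}} \ll \delta^{-O(1)}/N$$

for $\gg \delta^{O(1)} N$ values of $h \in [N]$, where

$$\alpha := \partial^2 Q(0). \tag{7.12}$$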

Proof. The assumption implies, looking at the second derivative at $n = 0$, that

$$\|\partial^2 (P - Q)(0) + \partial^2 Q(h)\|_{\mathbb{R}/\mathbb{Z}} \ll \delta^{-O(1)}/N^2$$

for $\gg \delta^{O(1)} N$ values of $h \in [N]$. Applying Lemma 4.5 then implies that

$$\|\partial^2 (P - Q)(0) + \partial^2 Q\|_{C^\infty[N], \delta^{-O(1)}} \ll \delta^{-O(1)}/N^2.$$

Thus, as stated, we have

$$\|\partial^i Q(0)\|_{\mathbb{R}/\mathbb{Z}, \delta^{-O(1)}} \ll \delta^{-O(1)}/N^i$$

for $i \geq 3$, which means in view of the Taylor expansion of $Q$ that we can write

$$Q(n) = \alpha\binom{n}{2} + R(n),$$

where $R(0) = R(1) = R(2) = 0$ and $\|R\|_{C^\infty[N], \delta^{-O(1)}} \ll \delta^{-O(1)}$. Substituting back into
our assumption yields that

$$\Big\|P(n) + (\alpha h + \sigma(h))n + R(n+h) - R(n) + \alpha\binom{h}{2}\Big\|_{C^\infty[N]} \ll \delta^{-O(1)}$$

for $\gg \delta^{O(1)} N$ values of $h \in [N]$. Differentiating at zero and recalling that $P(0) = 0$ we
obtain

$$\|P(1) + \sigma(h) + \alpha h + \partial R(h)\|_{\mathbb{R}/\mathbb{Z}} \ll \delta^{-O(1)}/N,$$

which implies in view of the properties of $R$ that

$$\|P(1) + \sigma(h) + \alpha h\|_{\mathbb{R}/\mathbb{Z}, \delta^{-O(1)}} \ll \delta^{-O(1)}/N.$$

This completes the proof. □

Now let us recall (7.11). We know that $\|\eta \circ g_h^\square\|_{C^\infty[N]} \ll \delta^{-O(1)}$ for $\gg \delta^{O(1)} N$ values
of $h$, so let us apply the lemma with $P := \eta_1 \circ g$,

$$Q := \eta_2 \circ g_2, \tag{7.13}$$

and $\sigma(h) := \eta_2([g(1), \{g(1)^h\}])$. By pigeonholing in $h$ we see that there is some $q \leq \delta^{-O(1)}$
such that

$$\|q\eta_1(g(1)) + q\eta_2([g(1), \{g(1)^h\}]) + q\alpha h\|_{\mathbb{R}/\mathbb{Z}} \ll \delta^{-O(1)}/N.$$

By redefining $\eta_1$ and $\eta_2$ (none of the boundedness properties of Lemma 7.5 are lost by
doing this) we may write this as

$$\|\eta_1(g(1)) + \eta_2([g(1), \{g(1)^h\}]) + q\alpha h\|_{\mathbb{R}/\mathbb{Z}} \ll \delta^{-O(1)}/N. \tag{7.14}$$

We now proceed as in §5, using Mal'cev bases to work with explicit bracket polynomials.

Since $\eta_2$ annihilates $[G,[G,G]] \subseteq [G, G_2]$, we see that the map $x \mapsto \eta_2([g(1), x])$ is a
homomorphism. Thus there exists $\zeta \in \mathbb{R}^m$ such that

$$\eta_2([g(1), x]) = \zeta \cdot \psi(x) \pmod{\mathbb{Z}} \tag{7.15}$$

for all $x \in G$. Since $\eta_2$ annihilates $[G, G_2]$, all but the first $m_{\mathrm{lin}}$ coordinates of $\zeta$ are
zero. Since we have reduced to the case $|\psi(g(1))| \leq 1$ and the basis $\mathcal{X}$ is $\frac{1}{\delta}$-rational it
follows that $|\zeta| \ll \delta^{-O(1)}$.

We now define $\beta := \eta_1(g(1))$ and $\gamma := \psi(g(1))$. Now since $[G,G] \subseteq G_2$ the map
$\psi_{\mathrm{lin}} : G \to \mathbb{R}^{m_{\mathrm{lin}}}$ which picks out the first $m_{\mathrm{lin}}$ Mal'cev coordinates is a homomorphism,
and therefore the first $m_{\mathrm{lin}}$ coordinates of $\psi(g(1)^h)$ are just $\gamma h$. We may now rewrite
(7.14) as

$$\|\beta + q\alpha h + \zeta \cdot \{\gamma h\}\|_{\mathbb{R}/\mathbb{Z}} \ll \delta^{-O(1)}/N \tag{7.16}$$

for $\gg \delta^{O(1)} N$ values of $h \in [N]$.

This assumption is the same as in Proposition 5.3 except that we do not have a
bound on $|q\alpha|$. However, we have

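Claim 7.7. At least one of the following statements holds:

(i) There is $r \ll \delta^{-O(1)}$ such that $\|r\zeta_i \ (\mathrm{mod}\ \mathbb{Z})\|_{\mathbb{R}/\mathbb{Z}} \ll \delta^{-O(1)}/N$ for $i = 1, \dots, m_{\mathrm{lin}}$;

(ii) There exists $k \in \mathbb{Z}^{m_{\mathrm{lin}}}$, $0 < |k| \ll \delta^{-O(1)}$, such that $\|k \cdot \gamma\|_{\mathbb{R}/\mathbb{Z}} \ll \delta^{-O(1)}/N$.

Proof. We apply Proposition 5.3 with $\zeta' := (\zeta, 1) \in \mathbb{R}^{m_{\mathrm{lin}}} \times \mathbb{R}$, $\gamma' := (\gamma, q\alpha) \in
\mathbb{R}^{m_{\mathrm{lin}}} \times \mathbb{R}$ and $\alpha' := 0$, deducing that either $|\zeta_i| \ll \delta^{-O(1)}/N$ for all $i = 1, \dots, m_{\mathrm{lin}}$ (in
which case (i) holds) or else there exist $k \in \mathbb{Z}^{m_{\mathrm{lin}}}$ and $r \in \mathbb{Z}$, not both zero and with
$|k|, |r| \ll \delta^{-O(1)}$, such that $\|k \cdot \gamma + qr\alpha\|_{\mathbb{R}/\mathbb{Z}} \ll \delta^{-O(1)}/N$. If $r = 0$ then (ii) holds, so
assume that $r \neq 0$. Multiplying (7.16) through by $r$ we see that for $\gg \delta^{O(1)} N$ values of
$h \in [N]$ we have

$$\|\tilde\beta + \tilde\alpha h + \tilde\zeta \cdot \{\gamma h\}\|_{\mathbb{R}/\mathbb{Z}} \ll \delta^{-O(1)}/N,$$

where $\tilde\beta := r\beta$, $\tilde\alpha := \{k \cdot \gamma + qr\alpha\}$ satisfies $|\tilde\alpha| \leq \delta^{-O(1)}/N$ and $\tilde\zeta := r\zeta - k$. Thus
we may apply Proposition 5.3 once more to conclude that either $|\tilde\zeta_i| \ll \delta^{-O(1)}/N$ for
$i = 1, \dots, m_{\mathrm{lin}}$, which implies (i), or else there is a nonzero $\tilde k \in \mathbb{Z}^{m_{\mathrm{lin}}}$ such that
$\|\tilde k \cdot \gamma\|_{\mathbb{R}/\mathbb{Z}} \ll \delta^{-O(1)}/N$, which implies (ii). This establishes the claim. □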

If Claim 7.7 (ii) holds then consider the map $\eta : G \to \mathbb{R}/\mathbb{Z}$ defined by

$$\eta(x) := k \cdot \psi(x) \pmod{\mathbb{Z}}.$$

Since $k \in \mathbb{Z}^{m_{\mathrm{lin}}}$, $\eta$ is a horizontal character and we have $|\eta| = |k| \ll \delta^{-O(1)}$. Finally we
have

$$\eta \circ g(n) = \eta(g(1)^n) = nk \cdot \gamma \pmod{\mathbb{Z}},$$

and so $\|\eta \circ g\|_{C^\infty[N]} \ll \delta^{-O(1)}$. This completes the proof of Theorem 7.1 in this case.

Suppose then that Claim 7.7 (i) holds. For each $i = 1, \dots, m$ consider the map
$\tau_i : G \to \mathbb{R}/\mathbb{Z}$ defined by

$$\tau_i(x) := r\eta_2([x, \exp(X_i)]).$$

Since $[\Gamma, \Gamma] \subseteq \Gamma$ and $[G,G] \subseteq G_2$ we see from the properties established in Lemma 7.5
that $\tau_i$ is a horizontal character which annihilates $G_2$. It is not hard to establish that
$|\tau_i| \ll \delta^{-O(1)}$. To do this we write (as usual)

$$\tau_i(x) = k_i \cdot \psi(x) \pmod{\mathbb{Z}},$$

where $k_i \in \mathbb{Z}^m$ (and in fact $k_i \in \mathbb{Z}^{m_{\mathrm{lin}}}$ since $\tau_i$ annihilates $G_2$). From the definition of
$\tau_i$, the bound $r \ll \delta^{-O(1)}$, the $\frac{1}{\delta}$-rationality of the basis $\mathcal{X}$ and Lemma A.3 we have

$$(k_i)_j = \tau_i(\exp(X_j)) = r\eta_2([\exp(X_j), \exp(X_i)]) \ll \delta^{-O(1)},$$

and so indeed $|\tau_i| = |k_i| \ll \delta^{-O(1)}$. Now we have

$$\tau_i \circ g(n) = n\tau_i(g(1)) = rn\zeta_i \pmod{\mathbb{Z}}$$

where the last equality follows from (7.15). By property (i), this implies that

$$\|\tau_i \circ g\|_{C^\infty[N]} \ll \delta^{-O(1)},$$

and so once again we have proved Theorem 7.1 unless $\tau_i = 0$ for all $i = 1, \dots, m$.

So far we have been successful in deducing Theorem 7.1 by induction on the degree $d$,
but we know from the example at the start of this section that it is not always possible
to make such a deduction as $G_\bullet$ may be "reducible" for $g$. It turns out that the case
we have not yet covered corresponds to this situation.

Suppose then that $\tau_i = 0$ for all $i$, so that $\eta_2([x, \exp(X_i)]) = 0$ for all $x \in G$ and all
$i \in [m]$. Since the homomorphism $\eta_2$ annihilates $[G,[G,G]] \subseteq [G, G_2]$, we see using the
identity $[x, yz] = [x,z]\,[z, [x,y]^{-1}]\,[x,y]$ that the map $y \mapsto \eta_2([x,y])$ is a homomorphism
for any fixed $x$. It follows that $\eta_2([x,y]) = 0$ for all $x, y \in G$, or in other words that $\eta_2$
annihilates $[G,G]$. Thus $\zeta = 0$ (cf. (7.15)) and (7.16) degenerates to

$$\|\beta + q\alpha h\|_{\mathbb{R}/\mathbb{Z}} \ll \delta^{-O(1)}/N$$

for $\gg \delta^{O(1)} N$ values of $h \in [N]$. By Lemma 3.2 this implies that

$$\|\alpha\|_{\mathbb{R}/\mathbb{Z}, \delta^{-O(1)}} \ll \delta^{-O(1)}/N^2,$$

and thus by (7.12)

$$\|\partial^2 Q(0)\|_{\mathbb{R}/\mathbb{Z}, \delta^{-O(1)}} \ll \delta^{-O(1)}/N^2,$$

where $Q$ was defined in (7.13). We have $Q(0) = Q(1) = 0$ and, by Lemma 7.6,
$\|\partial^i Q(0)\|_{\mathbb{R}/\mathbb{Z}, \delta^{-O(1)}} \ll \delta^{-O(1)}/N^i$ for $i \geq 3$. Thus

$$\|\eta_2 \circ g_2\|_{C^\infty[N], \delta^{-O(1)}} \ll \delta^{-O(1)}.$$

Thus there exists $q$, $1 \leq q \leq \delta^{-O(1)}$, such that

$$\|q\eta_2 \circ g_2\|_{C^\infty[N]} \ll \delta^{-O(1)}.$$

For notational simplicity we rename $q\eta_2$ as $\eta_2$, thus

$$\|\eta_2 \circ g_2\|_{C^\infty[N]} \ll \delta^{-O(1)}. \tag{7.17}$$

Roughly speaking, this statement means that $g$ exhibits some essentially linear behaviour
(in the "direction" orthogonal to $\eta_2$) inside $G_2$. For our purposes this means
that $G_2$ was too large to accurately capture the quadratic and higher order terms of $g$,
and we must pass to a finer filtration $G'_\bullet$ which does not have this drawback. This is
the point in the proof where we induct on the nonlinearity degree $m_*$.

Now $\eta_2 : G_2 \to \mathbb{R}/\mathbb{Z}$ has the form

$$\eta_2(x) = k \cdot \psi(x) \pmod{\mathbb{Z}},$$

where $k \in \mathbb{Z}^{m_2} \subseteq \mathbb{Z}^m$ satisfies $|k| \ll \delta^{-O(1)}$. In the ensuing discussion we will also need
the lift $\tilde\eta_2 : G_2 \to \mathbb{R}$ defined by

$$\tilde\eta_2(x) := k \cdot \psi(x).$$

Now the map $\theta : G_2 \times G_2 \to \mathbb{R}$ defined by $\theta(x,y) := \tilde\eta_2(xy) - \tilde\eta_2(x) - \tilde\eta_2(y)$ is continuous,
$\mathbb{Z}$-valued and vanishes when $x = y = \mathrm{id}_G$. Since $G_2 \times G_2$ is connected it follows that
$\theta = 0$ identically, and hence the lift $\tilde\eta_2$ is a homomorphism.

Lemma 7.8 (A finer subgroup sequence). Define $G' = G'_1 = G$ and $G'_i = G_i \cap \ker \tilde\eta_2$
for $i \geq 2$. Then $G'_\bullet = (G'_i)_{i=1}^\infty$ is a filtration with degree at most $d$ and nonlinearity
degree $m'_* \leq m_* - 1$. Each $G'_i$ is closed, connected and $\delta^{-O(1)}$-rational (with respect to
our Mal'cev basis $\mathcal{X}$ on $G/\Gamma$ adapted to $G_\bullet$).

Proof. Let $\pi : G_2 \to G_2/[G_2, G_2]$ be the natural projection. It follows from the
Baker-Campbell-Hausdorff formula $\exp(X)\exp(Y) = \exp(X + Y + \frac{1}{2}[X,Y] + \dots)$ that
$\pi \circ \exp : \mathfrak{g}_2 \to G_2/[G_2, G_2]$ is a linear map. Since $\tilde\eta_2 : G_2 \to \mathbb{R}$ factors through
$G_2/[G_2, G_2]$ it follows that $\tilde\eta_2 \circ \exp : \mathfrak{g}_2 \to \mathbb{R}$ is also a linear map. For $i = m_{\mathrm{lin}} + 1, \dots, m$
we have $\tilde\eta_2 \circ \exp(X_i) = k_i$, an integer of magnitude $O(\delta^{-O(1)})$. Thus by simple linear
algebra we see that each Lie algebra $\mathfrak{g}'_i = \mathfrak{g}_i \cap \ker(\tilde\eta_2 \circ \exp)$ is spanned by $O(\delta^{-O(1)})$-rational
combinations of the $X_i$. Thus the $G'_i$ are $O(\delta^{-O(1)})$-rational closed connected
subgroups as claimed.

If $i, j \geq 2$ then it is clear that $[G'_i, G'_j] \subseteq G'_{i+j}$ since $\tilde\eta_2 : G_2 \to \mathbb{R}$ is a homomorphism.
We must also check that $[G, G'_i] \subseteq G'_{i+1}$ for $i \geq 2$, which follows from the fact that
$[G, G_i] \subseteq [G, G_2] \subseteq \ker \tilde\eta_2$. The statement about $m'_*$ is immediate from the fact that $\eta_2$
is nontrivial, and it is obvious that the degree of $G'_\bullet$ is at most $d$. □

We now come to the main result of this section, which allows us to pass to a new
sequence $g' \in \mathrm{poly}(\mathbb{Z}, G'_\bullet)$ with smaller nonlinearity degree than $g$.

Lemma 7.9 (Factorization lemma). Suppose that (7.17) holds. Then we may factor
$g = \varepsilon g' \gamma$, where

(i) $\varepsilon \in \mathrm{poly}(\mathbb{Z}, G_\bullet)$, $\varepsilon(0) = \mathrm{id}_G$, $\varepsilon$ is $(\delta^{-O(1)}, N)$-smooth (cf. Definition 1.18)
and $\|\eta \circ \varepsilon\|_{C^\infty[N]} \ll \delta^{-O(1)}$ for all horizontal characters $\eta : G \to \mathbb{R}/\mathbb{Z}$ with
$0 < \|\eta\| \ll \delta^{-O(1)}$;

(ii) $g' \in \mathrm{poly}(\mathbb{Z}, G'_\bullet)$;

(iii) $\gamma \in \mathrm{poly}(\mathbb{Z}, G_\bullet)$ and $\gamma(n)\Gamma$ is periodic with period $Q \ll \delta^{-O(1)}$.

We remark that this lemma is strikingly similar in form to Proposition 9.2 below.
The proof of the latter result will, in fact, be closely modelled on the proof of this one,
but will be rather easier.


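Proof. By Lemma 6.7 and the fact that $g_2(0) = g_2(1) = \mathrm{id}_G$ we have

$$\psi(g_2(n)) = \sum_{i=2}^d t_i \binom{n}{i},$$

where $t_i \in \mathbb{R}^m$ and the coordinate $(t_i)_j$ is equal to $0$ if $j \leq m - m_i$. Thus

$$\tilde\eta_2 \circ g_2(n) = \sum_{i=2}^d k \cdot t_i \binom{n}{i}.$$

From (7.17) we thus have

$$\|k \cdot t_i\|_{\mathbb{R}/\mathbb{Z}} \ll \delta^{-O(1)}/N^i,$$

$i = 2, \dots, d$. Since $|k| \ll \delta^{-O(1)}$ we may choose vectors $u_i \in \mathbb{R}^m$ with $(u_i)_j = 0$ if
$j \leq m - m_i$ such that $|t_i - u_i| \ll \delta^{-O(1)}/N^i$ and $k \cdot u_i \in \mathbb{Z}$ for $i = 2, \dots, d$.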

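We may now pick vectors $v_i$ in $\mathbb{R}^m$ with $(v_i)_j = 0$ if $j \leq m - m_i$, all of whose
coordinates are rationals over some denominator $q \ll \delta^{-O(1)}$, such that $k \cdot u_i = k \cdot v_i$ for
$i = 2, \dots, d$.

Define sequences $\varepsilon, \gamma : \mathbb{Z} \to G$ by

$$\psi(\varepsilon(n)) := \sum_{i=2}^d \binom{n}{i}(t_i - u_i) \quad\text{and}\quad \psi(\gamma(n)) := \sum_{i=2}^d \binom{n}{i} v_i, \tag{7.18}$$

and set

$$g'(n) := \varepsilon(n)^{-1} g(n) \gamma(n)^{-1}.$$

Observe from Lemma 6.7 that $\varepsilon, \gamma$ lie in $\mathrm{poly}(\mathbb{Z}, G_\bullet)$ and take values in $G_2$. We verify
the properties of $\varepsilon$, $g'$ and $\gamma$ in turn.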

That $\varepsilon(0) = \mathrm{id}_G$ is obvious. To see that $\varepsilon$ is $(\delta^{-O(1)}, N)$-smooth we must confirm that
$d(\varepsilon(n), \varepsilon(n-1)) \ll \delta^{-O(1)}/N$ for all $n \in [N]$. Now as a fairly immediate consequence of
the definition of $\varepsilon$ we have that

$$|\psi(\varepsilon(n)) - \psi(\varepsilon(n-1))| \ll \delta^{-O(1)}/N$$

and

$$|\psi(\varepsilon(n))| \ll \delta^{-O(1)}$$

for all $n \in [N]$. The smoothness therefore follows from Lemma A.4. Finally we must
establish the statement about $\eta \circ \varepsilon$, where $\eta : G \to \mathbb{R}/\mathbb{Z}$ is a horizontal character. It is
clear that any horizontal character $\eta : G \to \mathbb{R}/\mathbb{Z}$ is represented in coordinates as

$$\eta(g) = k \cdot \psi(g) \pmod{\mathbb{Z}},$$

where $k_i = \eta(\exp(X_i))$ and so in particular $|k| \ll \delta^{-O(1)}$ if $\|\eta\| \ll \delta^{-O(1)}$. It follows
immediately from the definition of $\varepsilon$ that $\|\eta \circ \varepsilon\|_{C^\infty[N]} \ll \delta^{-O(1)}$, as required.

Next we show that $g' \in \mathrm{poly}(\mathbb{Z}, G'_\bullet)$. Now we have

$$g'(n) = \varepsilon(n)^{-1} g(n) \gamma(n)^{-1} = \varepsilon(n)^{-1} g_2(n) \gamma(n)^{-1} \cdot g(1)^n \cdot [g(1)^{-n}, \gamma(n)].$$

The first derivative of the sequence $n \mapsto g(1)^n$ is $g(1)$ and all higher derivatives are
just $\mathrm{id}_G$, so this sequence has coefficients in any subgroup sequence. Also the sequence
$[g(1)^{-n}, \gamma(n)]$ lies in $\mathrm{poly}(\mathbb{Z}, G'_\bullet)$ since it is in $\mathrm{poly}(\mathbb{Z}, G_\bullet)$ and takes values in $[G, G_2]$,
which is annihilated by $\tilde\eta_2$.

By the group property of $\mathrm{poly}(\mathbb{Z}, G'_\bullet)$ it therefore suffices to check that $\varepsilon^{-1} g_2 \gamma^{-1} \in
\mathrm{poly}(\mathbb{Z}, G'_\bullet)$. Since this sequence lies in $\mathrm{poly}(\mathbb{Z}, G_\bullet)$, we need only check that it is
annihilated by $\tilde\eta_2$, that is to say that

$$-\tilde\eta_2(\gamma(n)) - \tilde\eta_2(\varepsilon(n)) + \tilde\eta_2(g_2(n)) = 0.$$

Computing using coordinates we see that the left-hand side here is

$$\sum_{i=2}^d \binom{n}{i}\, k \cdot (-v_i + u_i - t_i + t_i)$$

which does indeed vanish by our construction of $u_i$ and $v_i$.

Finally we must check that $\gamma(n)\Gamma$ is periodic. By definition and Lemma A.11 we see
that $\gamma$ is $\delta^{-O(1)}$-rational (cf. Definition 1.17), and then the result follows instantly from
Lemma A.12 (ii). □

We will shortly be completing the proof of Theorem 7.1 in the case that (7.17) holds,
which is the only case left to handle. We isolate a technical lemma which allows us to
deduce $C^\infty[N]$-properties of polynomials $p(n)$ from properties of $p(a + bn)$.

Lemma 7.10 (Single-parameter extrapolation). Suppose that $Q, N \geq 1$ are integers
and $a, b$ are rationals with height at most $Q$ such that $b \neq 0$. Let $p : \mathbb{Z} \to \mathbb{R}/\mathbb{Z}$ be a
polynomial sequence of degree $d$ and write $\tilde p(n) := p(a + bn)$. Then there is some $q \in \mathbb{Z}$,
$1 \leq |q| \ll_d Q^{O_d(1)}$, such that

$$\|qp\|_{C^\infty[N]} \ll_d Q^{O_d(1)} \|\tilde p\|_{C^\infty[N]}.$$

We will defer the proof of this lemma to the next section, in which we prove a more
general multiparameter version of it (see Lemma 8.4).

Recall now that in our efforts to prove Theorem 7.1 by induction we had reduced
to the following situation: $g : \mathbb{Z} \to G$ is a polynomial sequence with $g(0) = \mathrm{id}_G$ and
$|\psi(g(1))| \leq 1$, and there is a function $F : G/\Gamma \to \mathbb{C}$ with nontrivial vertical oscillation
$\xi$ and $\|F\|_{\mathrm{Lip}} \leq 1$ such that

$$|\mathbb{E}_{n \in [N]} F(g(n)\Gamma)| > \delta.$$

Furthermore we reduced to the case when $g$ is "reducible" in the sense that (7.17) holds.
This allows us to factor $g$ as in Lemma 7.9, obtaining

$$|\mathbb{E}_{n \in [N]} F(\varepsilon(n) g'(n) \gamma(n)\Gamma)| > \delta.$$

Choose a $Q \ll \delta^{-O(1)}$ such that $\gamma(n)\Gamma$ is periodic with period $Q$, and split $[N]$ up into
progressions of length between $N'$ and $2N'$, where $N' := \lfloor \delta^C N \rfloor$, and common difference
$Q$. By the pigeonhole principle, there is some such progression $\{n_0 + nQ : n \in [N']\}$
such that

$$|\mathbb{E}_{n \in [N']} F(\varepsilon(n_0 + nQ) g'(n_0 + nQ) \{\gamma(n_0)\}\Gamma)| \geq \delta/2.$$
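The splitting and pigeonholing step is elementary, and the following sketch shows one way to carry it out. It is illustrative only: the function names are hypothetical, $[N]$ is taken to be $\{1, \dots, N\}$, and the code simply cuts each residue class modulo $Q$ into consecutive blocks whose lengths lie between $N'$ and $2N'$.

```python
def split_into_progressions(N, Q, N_prime):
    """Partition {1, ..., N} into arithmetic progressions of common
    difference Q whose lengths lie between N_prime and 2 * N_prime."""
    pieces = []
    for r in range(1, Q + 1):
        residue_class = list(range(r, N + 1, Q))
        length = len(residue_class)
        if length < N_prime:
            raise ValueError("N is too small relative to Q * N_prime")
        chunks = length // N_prime           # number of blocks in this class
        base, extra = divmod(length, chunks)  # block sizes: base or base + 1
        start = 0
        for i in range(chunks):
            size = base + (1 if i < extra else 0)
            pieces.append(residue_class[start:start + size])
            start += size
    return pieces

def best_piece(values, pieces):
    """Pigeonhole: the average over [N] is a weighted average of the
    averages over the pieces, so some piece has |average| at least as
    large as the global one. values maps n to a real or complex number."""
    return max(pieces, key=lambda p: abs(sum(values[n] for n in p) / len(p)))
```

In the argument above the gain from restricting to a single progression is that $\gamma(n_0 + nQ)\Gamma$ is constant in $n$, which is why the common difference is taken to be the period $Q$.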

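Now since $\varepsilon$ is $(\delta^{-O(1)}, N)$-smooth we see, using the right-invariance of $d$, that if $C$ is
sufficiently large then

$$|\mathbb{E}_{n \in [N']} F(\varepsilon(n_0) g'(n_0 + nQ) \{\gamma(n_0)\}\Gamma)| > \delta/4. \tag{7.19}$$

Now $g' \in \mathrm{poly}(\mathbb{Z}, G'_\bullet)$ and hence, by Lemma 6.5, the sequence

$$\tilde g(n) := \{g(n_0)\}^{-1} \varepsilon(n_0) g'(n_0 + nQ) \{\gamma(n_0)\}$$

is also in $\mathrm{poly}(\mathbb{Z}, G'_\bullet)$. The inequality (7.19) may be rewritten as

$$|\mathbb{E}_{n \in [N']} \tilde F(\tilde g(n)\Gamma)| > \delta/4, \tag{7.20}$$

where $\tilde F(x) := F(\{g(n_0)\}x)$. By Lemma A.5 we have $\|\tilde F\|_{\mathrm{Lip}} \ll \delta^{-O(1)}$. Noting that
$g(0) = \mathrm{id}_G$, we may thus apply the inductive hypothesis that Theorem 7.1 holds with
parameters $(d, m_* - 1)$, deducing that there is some horizontal character $\tilde\eta$ with $0 <
\|\tilde\eta\| \ll \delta^{-O(1)}$ such that

$$\|\tilde\eta \circ \tilde g\|_{C^\infty[N']} \ll \delta^{-O(1)}.$$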

From Lemma 7.10 and the definition of $\tilde g$ it follows that there is a horizontal character
$\eta$ with $0 < \|\eta\| \ll \delta^{-O(1)}$, such that

$$\|\eta \circ g''\|_{C^\infty[N]} \ll \delta^{-O(1)},$$

where

$$g''(n) := \{g(n_0)\}^{-1} \varepsilon(n_0) g'(n) \{\gamma(n_0)\}.$$

Since $g'(0) = \mathrm{id}_G$, it follows that

$$\|\eta \circ g'\|_{C^\infty[N]} \ll \delta^{-O(1)}.$$

To complete the proof of the result we must, of course, replace $g'$ by $g := \varepsilon g' \gamma$. To do
this, note first that by multiplying $\eta$ by an integer of size $O(\delta^{-O(1)})$ if necessary we in
fact have

$$\|\eta \circ \gamma\|_{C^\infty[N]} = 0,$$

since the Mal'cev coordinates $\psi(\gamma(n)\Gamma)$ are always rationals over some denominator
$\ll \delta^{-O(1)}$. From the property (i) of Lemma 7.9 we have that $\|\eta \circ \varepsilon\|_{C^\infty[N]} \ll \delta^{-O(1)}$.

Putting all this together, we obtain

$$\|\eta \circ g\|_{C^\infty[N]} \leq \|\eta \circ \varepsilon\|_{C^\infty[N]} + \|\eta \circ g'\|_{C^\infty[N]} + \|\eta \circ \gamma\|_{C^\infty[N]} \ll \delta^{-O(1)},$$

completing (at last!) the proof of Theorem 7.1. □

Let us remind the reader that, by remarks immediately following the statement of
Theorem 7.1, we have also completed the proof of Theorem 2.9.

8. The multiparameter Leibman theorem 

We have proved one of our main results, Theorem 2.9. In this section we bootstrap
this result into a multiparameter version of itself. Strictly speaking, this step is not
necessary in order to establish any of the results stated in the introduction; however,
the arguments here are not terribly difficult, and will be needed in order to obtain
multiparameter analogues of those results.

Recall from §6 the definition of $\mathrm{poly}(\mathbb{Z}^t, G_\bullet)$, the group of polynomial sequences $g :
\mathbb{Z}^t \to G$ with coefficients in $G_\bullet$. Recall also the definition of, and notation for, multibinomial
coefficients $\binom{n}{j}$.

We need an analogue of the smoothness norms $C^\infty[N]$ in the multiparameter setting.
To set these up, we introduce the Taylor coefficients of a polynomial map $g : \mathbb{Z}^t \to \mathbb{R}/\mathbb{Z}$.

Definition 8.1 (Taylor expansion). Suppose that $g : \mathbb{Z}^t \to \mathbb{R}/\mathbb{Z}$ is a polynomial map.
Then we define the Taylor coefficients $\alpha_j \in \mathbb{R}/\mathbb{Z}$ for $j \in \mathbb{N}_0^t$ to be the unique elements
of $\mathbb{R}/\mathbb{Z}$ such that

$$g(n) = \sum_j \alpha_j \binom{n}{j}$$

for all $n$; it is not difficult to verify the existence and uniqueness of these coefficients, and
to check that if $g$ has degree at most $d$ then $\alpha_j = 0$ unless $|j| \leq d$, where $|j| := j_1 + \dots + j_t$.
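The existence claim can be made concrete: the Taylor coefficients are iterated forward differences of $g$ at the origin, $\alpha_j = \sum_{i \leq j} (-1)^{|j| - |i|} \binom{j}{i} g(i)$. The sketch below applies this standard Newton-series formula to a rational-valued representative of $g$ (reduce modulo 1 afterwards to obtain elements of $\mathbb{R}/\mathbb{Z}$). The function names are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import product
from math import comb

def multibinomial(n, j):
    """prod_i binom(n_i, j_i) for tuples of non-negative integers."""
    result = 1
    for ni, ji in zip(n, j):
        result *= comb(ni, ji)
    return result

def taylor_coefficients(g, t, d):
    """Coefficients alpha_j (|j| <= d) with g(n) = sum_j alpha_j binom(n, j),
    computed as iterated forward differences of g at the origin.
    g takes a t-tuple of non-negative integers and returns a Fraction."""
    coeffs = {}
    for j in product(range(d + 1), repeat=t):
        if sum(j) > d:
            continue
        total = Fraction(0)
        # Sum over all i <= j componentwise, with alternating signs.
        for i in product(*(range(ji + 1) for ji in j)):
            sign = (-1) ** (sum(j) - sum(i))
            total += sign * multibinomial(j, i) * g(i)
        coeffs[j] = total
    return coeffs
```

For example, for $g(n_1, n_2) = \tfrac{3}{7} n_1 + \tfrac{1}{5} n_1 \binom{n_2}{2} + 2$ the only non-zero coefficients are $\alpha_{(0,0)} = 2$, $\alpha_{(1,0)} = \tfrac{3}{7}$ and $\alpha_{(1,2)} = \tfrac{1}{5}$.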

Definition 8.2 (Smoothness norms). Suppose that $g : \mathbb{Z}^t \to \mathbb{R}/\mathbb{Z}$ is a polynomial map
with Taylor expansion

$$g(n) = \sum_j \alpha_j \binom{n}{j}.$$

Then for any $t$-tuple $N = (N_1, \dots, N_t)$ with $N_1, \dots, N_t \geq 1$ we write $[N] := [N_1] \times \dots \times [N_t]$
and

$$\|g\|_{C^\infty[N]} := \sup_{j \neq 0} N^j \|\alpha_j\|_{\mathbb{R}/\mathbb{Z}},$$

where $N^j := N_1^{j_1} \cdots N_t^{j_t}$.

We have the following generalisation of Lemma 2.8.

Lemma 8.3 (Smooth polynomials vary slowly). Let $g : \mathbb{Z}^t \to \mathbb{R}/\mathbb{Z}$ be a polynomial
sequence of degree at most $d$ and suppose that $n \in [N]$. Then for any $i \in [t]$ we have

$$\|g(n) - g(n - e_i)\|_{\mathbb{R}/\mathbb{Z}} \ll_{t,d} \frac{1}{N_i} \|g\|_{C^\infty[N]},$$

where $e_i = (0, \dots, 0, 1, 0, \dots, 0)$ is the $i$th basis vector of $\mathbb{Z}^t$.

Proof. From the Taylor expansion and binomial identities we have

$$g(n) - g(n - e_i) = \sum_j \alpha_j \binom{n - e_i}{j - e_i}.$$

Thus

$$\|g(n) - g(n - e_i)\|_{\mathbb{R}/\mathbb{Z}} \leq \|g\|_{C^\infty[N]} \sum_{|j| \leq d} N^{-j} \binom{n - e_i}{j - e_i} \ll_{t,d} \frac{1}{N_i} \|g\|_{C^\infty[N]},$$

as required. □
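Lemma 8.3 can be checked numerically on a small example. The sketch below is an illustration under stated assumptions: the polynomial is given by a dictionary of rational Taylor coefficients, and the implied constant is taken to be the number of multi-indices $j$ in the support with $j_i \geq 1$, which is what the crude bound $\binom{n - e_i}{j - e_i} \leq N^{j - e_i}$ for $n \in [N]$ gives. The function names are hypothetical.

```python
from fractions import Fraction
from itertools import product
from math import comb

def dist_to_Z(x):
    """||x||_{R/Z}: distance from x to the nearest integer."""
    frac = x - (x // 1)
    return min(frac, 1 - frac)

def smoothness_norm(coeffs, N):
    """sup over j != 0 of N^j * ||alpha_j||_{R/Z}; coeffs maps j to alpha_j."""
    best = Fraction(0)
    for j, alpha in coeffs.items():
        if any(j):
            weight = 1
            for Ni, ji in zip(N, j):
                weight *= Ni ** ji
            best = max(best, weight * dist_to_Z(alpha))
    return best

def evaluate(coeffs, n):
    """g(n) = sum_j alpha_j * prod_i binom(n_i, j_i), as a rational number."""
    total = Fraction(0)
    for j, alpha in coeffs.items():
        term = alpha
        for ni, ji in zip(n, j):
            term *= comb(ni, ji)
        total += term
    return total

def max_step(coeffs, N, i):
    """max over n in [N] of ||g(n) - g(n - e_i)||_{R/Z}."""
    worst = Fraction(0)
    for n in product(*(range(1, Ni + 1) for Ni in N)):
        shifted = tuple(nk - (1 if k == i else 0) for k, nk in enumerate(n))
        step = evaluate(coeffs, n) - evaluate(coeffs, shifted)
        worst = max(worst, dist_to_Z(step))
    return worst
```

With coefficients $\alpha_{(1,0)} = \tfrac{1}{50}$, $\alpha_{(1,1)} = \tfrac{1}{400}$, $\alpha_{(0,2)} = \tfrac{3}{1000}$ and $N = (10, 20)$, the norm is $\tfrac{6}{5}$ and each step is bounded by $2 \cdot \tfrac{6}{5} / N_i$.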

We now give a multiparameter version of Lemma 7.10 which implies that lemma as
the $t = 1$ special case.

Lemma 8.4 (Multiparameter extrapolation). Suppose that $t, Q, N_1, \dots, N_t, d \geq 1$ are
integer parameters and that $a_i, b_i \in \mathbb{Q}$, $i = 1, \dots, t$ are rationals of height at most $Q$
with $b_i \neq 0$. Let $p : \mathbb{Z}^t \to \mathbb{R}/\mathbb{Z}$ be a polynomial map of degree at most $d$ and write
$\tilde p(n) := p(a_1 + b_1 n_1, \dots, a_t + b_t n_t)$. Then there is some $q \in \mathbb{Z}$, $|q| \ll_{d,t} Q^{O_{d,t}(1)}$, such that

$$\|qp\|_{C^\infty[N]} \ll_{d,t} Q^{O_{d,t}(1)} \|\tilde p\|_{C^\infty[N]}.$$

Proof. First of all observe that, if $a, b \in \mathbb{Q}$ are rationals with height at most $Q$ and
$b \neq 0$, we may expand

$$\binom{(n - a)/b}{j} = \sum_{j'=0}^{j} c(a,b,j',j) \binom{n}{j'},$$

where $c(a,b,j',j)$ is a rational number with height $O_j(Q^{O_j(1)})$. Indeed we clearly have
$c(a,b,j,j) = b^{-j}$, and we may then compute $c(a,b,j-1,j), c(a,b,j-2,j), \dots$ in turn.
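The coefficients $c(a,b,j',j)$ are easy to compute exactly. The proof determines them from the top coefficient downwards; the sketch below instead uses forward differences at $0$ (a swapped-in but equivalent technique, since the Newton expansion of a polynomial is unique). The function names are hypothetical and $a$, $b$ are passed as `Fraction` values.

```python
from fractions import Fraction
from math import comb, factorial

def gen_binom(x, j):
    """Generalised binomial coefficient x(x-1)...(x-j+1)/j! for rational x."""
    result = Fraction(1)
    for i in range(j):
        result *= (x - i)
    return result / factorial(j)

def expansion_coefficients(a, b, j):
    """List c with binom((n - a)/b, j) = sum_{j'=0}^{j} c[j'] * binom(n, j').

    c[j'] is the j'-th forward difference at 0 of n -> binom((n - a)/b, j).
    """
    def f(n):
        return gen_binom((Fraction(n) - a) / b, j)
    return [sum((-1) ** (jp - i) * comb(jp, i) * f(i) for i in range(jp + 1))
            for jp in range(j + 1)]
```

The denominators of these rationals are products of powers of the denominators of $a$, $1/b$ and of $j!$, which is the source of the $Q^{O_j(1)}$ height bound.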

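Multiplying such relations together we obtain a multiparameter version, viz.

$$\prod_{i=1}^t \binom{(n_i - a_i)/b_i}{j_i} = \sum_{j' \leq j} c(a,b,j',j) \binom{n}{j'},$$

where $j' \leq j$ means that each component of $j'$ is at most the corresponding component
of $j$.

Applying this allows us to give the Taylor coefficients $\alpha_j$ of $p$ in terms of those of $\tilde p$.
Indeed we have

$$p(n) = \tilde p\Big(\frac{n_1 - a_1}{b_1}, \dots, \frac{n_t - a_t}{b_t}\Big) = \sum_j \tilde\alpha_j \prod_{i=1}^t \binom{(n_i - a_i)/b_i}{j_i} = \sum_j \sum_{j' \leq j} c(a,b,j',j)\, \tilde\alpha_j \binom{n}{j'},$$

and so

$$\alpha_j = \sum_{j' \geq j} c(a,b,j,j')\, \tilde\alpha_{j'}.$$

To obtain the lemma, we simply need to take $q$ to be the product of all the denominators
of the rationals $c(a,b,j,j')$, which is clearly $\ll_{d,t} Q^{O_{d,t}(1)}$. □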

Definition 8.5 (Multiparameter equidistribution). Let $G/\Gamma$ be a nilmanifold and let
$\delta > 0$. A finite sequence $(g(n)\Gamma)_{n \in P}$ in $G/\Gamma$ indexed by a finite non-empty set $P$ is
$\delta$-equidistributed if we have

$$\Big| \mathbb{E}_{n \in P} F(g(n)\Gamma) - \int_{G/\Gamma} F \Big| \leq \delta \|F\|_{\mathrm{Lip}}$$

for all Lipschitz functions $F : G/\Gamma \to \mathbb{C}$. If $N = (N_1, \dots, N_t)$, we say that a sequence
$(g(n)\Gamma)_{n \in [N]}$ is totally $\delta$-equidistributed if we have

$$\Big| \mathbb{E}_{n \in P_1 \times \dots \times P_t} F(g(n)\Gamma) - \int_{G/\Gamma} F \Big| \leq \delta \|F\|_{\mathrm{Lip}}$$

whenever $P_i$ are arithmetic progressions in $[N_i]$ of length at least $\delta N_i$ for each $1 \leq i \leq t$.
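As a toy illustration of what the definition measures, consider the simplest nilmanifold, the circle $\mathbb{R}/\mathbb{Z}$ (taking $t = 1$ and $g(n) = \alpha n$). The sketch below computes the deviation $|\mathbb{E}_{n \in [N]} F(g(n)) - \int F|$ for the single test function $F(x) = e(kx) = \exp(2\pi i k x)$ with $k \neq 0$, whose integral is $0$. This is only a sanity check under these assumptions: the definition quantifies over all Lipschitz $F$, whereas the code tests one character, and the function name is hypothetical.

```python
import cmath

def average_deviation(alpha, N, k=1):
    """|E_{n in [N]} e(k * alpha * n)| with e(x) = exp(2*pi*i*x).

    For k != 0 the integral of x -> e(kx) over R/Z is 0, so this is the
    deviation of the orbit average from the mean for this one test function.
    """
    total = sum(cmath.exp(2j * cmath.pi * k * alpha * n) for n in range(1, N + 1))
    return abs(total / N)
```

When $\alpha$ is an integer the orbit is constant and the deviation is $1$. When $\alpha = \sqrt{2}$ the geometric-series bound $|\sum_{n \leq N} e(\alpha n)| \leq 1/|\sin(\pi\alpha)|$ shows the deviation is $O(1/N)$.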



We can now give the multiparameter version of Theorem 2.9.

Theorem 8.6 (Multiparameter quantitative Leibman theorem). Let $s, m, t \geq 1$ and
$0 < \delta < 1/2$, and let $N_1, \dots, N_t \geq 1$ and $d \geq 1$ be integers. Suppose that $G/\Gamma$ is
an $m$-dimensional nilmanifold equipped with a $\frac{1}{\delta}$-rational Mal'cev basis $\mathcal{X}$ adapted to
some filtration $G_\bullet$ of degree $d$, and that $g \in \mathrm{poly}(\mathbb{Z}^t, G_\bullet)$. Then either $(g(n)\Gamma)_{n \in [N]}$ is
$\delta$-equidistributed, or else there is some horizontal character $\eta$ with $0 < \|\eta\| \ll \delta^{-O_{d,m,t}(1)}$
such that

$$\|\eta \circ g\|_{C^\infty[N]} \ll \delta^{-O_{d,m,t}(1)}.$$



Proof. We allow all implied constants to depend on $d$, $m$ and $t$. Suppose that
$(g(n)\Gamma)_{n \in [N]}$ is not $\delta$-equidistributed. Suppose to begin with that $N_1 \geq \delta^{-C}$.

A simple averaging argument confirms that, for $\gg \delta^{O(1)} N_2 \cdots N_t$ values of $(n_2, \dots, n_t) \in
[N_2] \times \dots \times [N_t]$, the polynomial sequence $(g_{n_2,\dots,n_t}(n)\Gamma)_{n \in [N_1]}$ is not $\delta^{O(1)}$-equidistributed,
where $g_{n_2,\dots,n_t}(n) := g(n, n_2, \dots, n_t)$.

For each such tuple $(n_2, \dots, n_t)$, Theorem 2.9 implies that there is some horizontal
character $\eta_{n_2,\dots,n_t}$ with $0 < \|\eta_{n_2,\dots,n_t}\| \ll \delta^{-O(1)}$ such that

$$\|\eta_{n_2,\dots,n_t} \circ g_{n_2,\dots,n_t}\|_{C^\infty[N_1]} \ll \delta^{-O(1)}.$$

By pigeonholing in $\eta$ and passing to a thinner set of tuples $(n_2, \dots, n_t)$ we may assume
that $\eta = \eta_{n_2,\dots,n_t}$ does not depend on $(n_2, \dots, n_t)$. Writing $p := \eta \circ g$ and expanding

$$p(n) = \sum_{i_1=0}^{d} p_{i_1}(n_2, \dots, n_t) \binom{n_1}{i_1},$$

where the $p_{i_1}$ are polynomials, we therefore see that

$$\|p_{i_1}(n_2, \dots, n_t)\|_{\mathbb{R}/\mathbb{Z}} \ll \delta^{-O(1)}/N_1^{i_1} \tag{8.1}$$

for $\gg \delta^{O(1)} N_2 \cdots N_t$ values of $(n_2, \dots, n_t)$, for each $i_1 = 0, \dots, d$. In particular (for each
$i_1$) there are $\gg \delta^{O(1)} N_3 \cdots N_t$ values of $(n_3, \dots, n_t)$ for which (8.1) holds for $\gg \delta^{O(1)} N_2$
values of $n_2$.

Suppose that $i_1 > 0$. Writing

$$p_{i_1}(n_2, \dots, n_t) = \sum_{i_2} p_{i_1,i_2}(n_3, \dots, n_t) \binom{n_2}{i_2}$$

and applying Lemma 4.5, we see that for $\gg \delta^{O(1)} N_3 \cdots N_t$ tuples $(n_3, \dots, n_t)$ there is
$q_{i_1}(n_3, \dots, n_t) \ll \delta^{-O(1)}$ such that

$$\|q_{i_1}(n_3, \dots, n_t)\, p_{i_1,i_2}(n_3, \dots, n_t)\|_{\mathbb{R}/\mathbb{Z}} \ll \delta^{-O(1)}/N_1^{i_1} N_2^{i_2}.$$

Note that the application of Lemma 4.5 is valid because $i_1 > 0$ and $N_1 \geq \delta^{-C}$;
this guarantees that the parameter $\varepsilon$ in that lemma is small enough. Pigeonholing
in $(n_3, \dots, n_t)$ and passing to a somewhat smaller set of these tuples we may suppose
that $q_{i_1} = q_{i_1}(n_3, \dots, n_t)$ is constant.

We now continue in this vein, obtaining successively quantities $q_{i_1,i_2,\dots,i_r} \ll \delta^{-O(1)}$. At
the final stage we obtain

$$\|q_{i_1,\dots,i_t}\, p_{i_1,\dots,i_t}\|_{\mathbb{R}/\mathbb{Z}} \ll \delta^{-O(1)}/N_1^{i_1} \cdots N_t^{i_t}$$

or, in our earlier notation,

$$\|q_i p_i\|_{\mathbb{R}/\mathbb{Z}} \ll \delta^{-O(1)}/N^i. \tag{8.2}$$

This has been obtained for all $i$ with $i_1 > 0$ on the assumption that $N_1 \geq \delta^{-C}$. By
switching the indices $i_1, \dots, i_t$ if necessary, we may in fact obtain such a $q_i$ whenever
there is some $r$ with $N_r > \delta^{-C}$. If this is not the case for any $r$ then (8.2) holds anyway
for trivial reasons (for any $q_i \ll \delta^{-O(1)}$).

Note that by construction the $p_i$ are simply the Taylor coefficients of $p$.
Taking $q := \prod_i q_i$ we see that $q \ll \delta^{-O(1)}$ and that

$$\|q p_i\|_{\mathbb{R}/\mathbb{Z}} \ll \delta^{-O(1)}/N^i$$

for each index $i$ and thus

$$\|q\eta \circ g\|_{C^\infty[N]} \ll \delta^{-O(1)}.$$

The theorem follows. □



9. A multiparameter initial factorization theorem



Having just established Theorem 8.6, we now use it to obtain an initial factorization
theorem for multiparameter polynomial sequences. We first give a multiparameter version
of Definition 1.18, the definition of a smooth sequence (the multiparameter version
of a rational sequence is obvious).

Definition 9.1 (Multiparameter smooth sequences). Let $G/\Gamma$ be a nilmanifold with
a Mal'cev basis $\mathcal{X}$. Let $(\varepsilon(n))_{n \in \mathbb{Z}^t}$ be a multiparameter sequence in $G$, let $M \geq 1$ be
an integer and let $N = (N_1, \dots, N_t)$ with $N_i \geq 1$ for all $i$. We say that $(\varepsilon(n))_{n \in \mathbb{Z}^t}$
is $(M, N)$-smooth if we have $d(\varepsilon(n), \mathrm{id}_G) \leq M$ and $d(\varepsilon(n), \varepsilon(n - e_i)) \leq M/N_i$ for all
$n \in [N]$.

Here, then, is the main result of this section. 

Proposition 9.2 (Factorization of poorly-distributed polynomial sequences). Let $s, m,
t \geq 1$, let $0 < \delta < 1/2$, and let $N_1, \dots, N_t \geq 1$ and $d \geq 0$ be integers. Write $N :=
(N_1, \dots, N_t)$. Let $G/\Gamma$ be an $m$-dimensional nilmanifold with a $\frac{1}{\delta}$-rational Mal'cev basis
$\mathcal{X}$ adapted to a filtration $G_\bullet$ of degree $d$, and suppose that $g \in \mathrm{poly}(\mathbb{Z}^t, G_\bullet)$. Suppose
that $(g(n)\Gamma)_{n \in [N]}$ is not totally $\delta$-equidistributed. Then there is a factorization $g = \varepsilon g' \gamma$,
where $\varepsilon, g', \gamma \in \mathrm{poly}(\mathbb{Z}^t, G_\bullet)$ are polynomial sequences with the following properties:

(i) $\varepsilon : \mathbb{Z}^t \to G$ is $(O(\delta^{-O_{d,m,t}(1)}), N)$-smooth;

(ii) $g' : \mathbb{Z}^t \to G'$ takes values in a connected proper subgroup $G'$ of $G$ which is
$O(\delta^{-O_{d,m,t}(1)})$-rational relative to $\mathcal{X}$;

(iii) $\gamma : \mathbb{Z}^t \to G$ is $\delta^{-O_{d,m,t}(1)}$-rational.

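Proof. We will allow all implied constants to depend on $d$, $m$ and $t$.

We first reduce to the case $g(0) = \mathrm{id}_G$, by factorizing $g = \{g(0)\}\tilde{g}[g(0)]$ where $\tilde{g}$ is the
polynomial sequence $\tilde{g} := \{g(0)\}^{-1} g [g(0)]^{-1}$, for which $\tilde{g}(0) = \mathrm{id}_G$. If $(g(\vec{n})\Gamma)_{\vec{n} \in [\vec{N}]}$ is
not totally $\delta$-equidistributed, then one easily verifies using Lemma A.5 that $(\tilde{g}(\vec{n})\Gamma)_{\vec{n} \in [\vec{N}]}$
is not totally $\tilde{\delta}$-equidistributed for some $\tilde{\delta} \gg \delta^{O(1)}$. Applying the proposition to $\tilde{g}$, we
obtain a factorization $\tilde{g} = \tilde{\varepsilon} g' \tilde{\gamma}$. Setting $\varepsilon := \{g(0)\}\tilde{\varepsilon}$ and $\gamma := \tilde{\gamma}[g(0)]$, we certainly have
$g = \varepsilon g' \gamma$. The sequence $\gamma$ is $\delta^{-O(1)}$-rational by Lemma A.11 and (the multiparameter
version of) Lemma A.12. The sequence $\varepsilon$ is $(\delta^{-O(1)}, \vec{N})$-smooth by Lemma A.5.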

Henceforth, then, we assume that $g(0) = \mathrm{id}_G$. By hypothesis, we can find progressions
$P_i := \{a_i + b_i n_i : n_i \in [N_i']\}$ in $[N_i]$ with $N_i' \geq \delta N_i$ such that the polynomial sequence
$\tilde{g} : \mathbb{Z}^t \to G$ defined by $\tilde{g}(\vec{n}) = g(a_1 + b_1 n_1, \dots, a_t + b_t n_t)$ is such that $(\tilde{g}(\vec{n})\Gamma)_{\vec{n} \in [\vec{N}']}$ fails to be
$\delta$-equidistributed, where $\vec{N}' := (N_1', \dots, N_t')$. By Lemma EH we have $\tilde{g} \in \mathrm{poly}(\mathbb{Z}^t, G_\bullet)$.
Applying Theorem 2.9 we conclude the existence of a horizontal character $\tilde{\eta} : G \to \mathbb{R}/\mathbb{Z}$
with $0 < \|\tilde{\eta}\| \ll \delta^{-O(1)}$ such that

$$\|\tilde{\eta} \circ \tilde{g}\|_{C^\infty[\vec{N}']} \ll \delta^{-O(1)}.$$

At the expense of worsening the exponent of the $\delta^{-O(1)}$, we may replace $[\vec{N}']$ here by
$[\vec{N}]$. Applying Lemma 8.4 we deduce that there is a horizontal character $\eta : G \to \mathbb{R}/\mathbb{Z}$
with $0 < \|\eta\| \ll \delta^{-O(1)}$ such that

$$\|\eta \circ g\|_{C^\infty[\vec{N}]} \ll \delta^{-O(1)}. \qquad (9.1)$$

Take $G'$ to be the connected component of $\ker(\eta)$. Then $G'$ is rather clearly a subgroup
of $G$ which is $O(\delta^{-O(1)})$-rational relative to $X$.

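Write

$$\psi(g(\vec{n})) = \sum_{\vec{j}} t_{\vec{j}} \binom{\vec{n}}{\vec{j}},$$

where $t_{\vec{j}} \in \mathbb{R}^m$. By Lemma 6.7 we know that the coordinate $(t_{\vec{j}})_i$ is equal to $0$ if
$i \leq m - m_{|\vec{j}|}$. The horizontal character $\eta$ is given in coordinates by

$$\eta(x) = k \cdot \psi(x),$$

where $0 < |k| \ll \delta^{-O(1)}$, and (9.1) tells us that $\|k \cdot t_{\vec{j}}\|_{\mathbb{R}/\mathbb{Z}} \ll \delta^{-O(1)}/\vec{N}^{\vec{j}}$ for all $\vec{j} \neq 0$. Since
$|k| \ll \delta^{-O(1)}$ we may choose vectors $u_{\vec{j}} \in \mathbb{R}^m$ such that $|t_{\vec{j}} - u_{\vec{j}}| \ll \delta^{-O(1)}/\vec{N}^{\vec{j}}$ and
$k \cdot u_{\vec{j}} \in \mathbb{Z}$ for all $\vec{j} \neq 0$. We then choose vectors $v_{\vec{j}} \in \mathbb{R}^m$, all of whose coordinates are
rationals with complexity at most $O(\delta^{-O(1)})$, such that $k \cdot u_{\vec{j}} = k \cdot v_{\vec{j}}$ for all $\vec{j} \neq 0$. We
may insist that the $u_{\vec{j}}$ and $v_{\vec{j}}$ have the same support properties as the $t_{\vec{j}}$, namely that
$(u_{\vec{j}})_i = (v_{\vec{j}})_i = 0$ if $i \leq m - m_{|\vec{j}|}$.

Define polynomial sequences $\varepsilon, \gamma : \mathbb{Z}^t \to G$ in terms of their Mal'cev coordinates by

$$\psi(\varepsilon(\vec{n})) = \sum_{\vec{j} \neq 0} (t_{\vec{j}} - u_{\vec{j}}) \binom{\vec{n}}{\vec{j}} \quad \text{and} \quad \psi(\gamma(\vec{n})) = \sum_{\vec{j} \neq 0} v_{\vec{j}} \binom{\vec{n}}{\vec{j}},$$

and

$$g' := \varepsilon^{-1} g \gamma^{-1}.$$

By Lemma 6.7 and the fact that $\mathrm{poly}(\mathbb{Z}^t, G_\bullet)$ is a group we see that all three of $\varepsilon$, $g'$
and $\gamma$ lie in $\mathrm{poly}(\mathbb{Z}^t, G_\bullet)$. We must check the claims (i), (ii) and (iii). The claim (ii) is
clear. To prove (i), that is to say that $\varepsilon$ is $(\delta^{-O(1)}, \vec{N})$-smooth, we need to show that

$$d(\varepsilon(\vec{n}), \varepsilon(\vec{n} - \vec{e}_i)) \ll \delta^{-O(1)}/N_i$$

for $\vec{n} \in [\vec{N}]$. But as a fairly immediate consequence of the definition of $\varepsilon$ we have the
bound

$$|\psi(\varepsilon(\vec{n})) - \psi(\varepsilon(\vec{n} - \vec{e}_i))| \ll \delta^{-O(1)}/N_i,$$

and so the desired bound follows from Lemma A.4. Finally we note that (iii) follows
immediately from the definition of $\gamma$ and the properties of rational points described in
Lemma A.11. □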
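
The choice of the vectors $u_{\vec{j}}$ in the proof of Proposition 9.2 can be made concrete. The Python sketch below is one possible way to do it, not necessarily the one the authors have in mind: given an integer vector $k$ and a vector $t$ with $k \cdot t$ close to an integer, it alters a single coordinate of $t$ to obtain $u$ with $k \cdot u \in \mathbb{Z}$ and $|t - u| \leq \|k \cdot t\|_{\mathbb{R}/\mathbb{Z}}$. The names `dot`, `nearest_integer` and `snap_to_integer_pairing` are hypothetical. The sketch ignores the support conditions on $u_{\vec{j}}$ and the construction of the rational vectors $v_{\vec{j}}$.

```python
from fractions import Fraction

def dot(k, t):
    """Euclidean pairing k . t."""
    return sum(ki * ti for ki, ti in zip(k, t))

def nearest_integer(x):
    """Nearest integer to the Fraction x (ties rounded up), i.e. floor(x + 1/2)."""
    return (2 * x.numerator + x.denominator) // (2 * x.denominator)

def snap_to_integer_pairing(k, t):
    """Return u with k . u an integer and |t - u|_inf <= dist(k . t, Z).

    Only one coordinate of t is changed: the first one at which k is non-zero.
    Since k has integer entries, dividing the defect by k[i0] cannot enlarge it.
    """
    i0 = next(i for i, ki in enumerate(k) if ki != 0)
    pairing = dot(k, t)
    theta = pairing - nearest_integer(pairing)  # signed distance from k . t to Z
    u = list(t)
    u[i0] = t[i0] - theta / k[i0]
    return tuple(u)

k = (2, 3)
t = (Fraction(1, 2) + Fraction(1, 1000), Fraction(0))
u = snap_to_integer_pairing(k, t)
```

In this example $k \cdot t = 1 + 1/500$, and the function returns $u = (1/2, 0)$, so that $k \cdot u = 1$ and $|t - u| = 1/1000$.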



10. A MULTIPARAMETER COMPLETE FACTORIZATION THEOREM 



The last major task of the paper is to iterate Proposition 9.2 to deduce a multiparameter
version of our main result, Theorem 1.19. We first need a technical lemma.

Lemma 10.1 (Product of smooth sequences is smooth). Let $G/\Gamma$ be a nilmanifold of
dimension $m$ and let $M \geq 2$ and $N_1, \dots, N_t \geq 1$ be parameters. Suppose that $X$ is an
$M$-rational Mal'cev basis for $G/\Gamma$ adapted to some filtration $G_\bullet$ of degree $d$, and suppose
that the maps $\varepsilon_1, \varepsilon_2 : \mathbb{Z}^t \to G$ are $(M, \vec{N})$-smooth in the sense of Definition 9.1. Then
the product $\varepsilon_1\varepsilon_2$ is $(M^{O_{d,m,t}(1)}, \vec{N})$-smooth.

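Proof. For all $\vec{n} \in [\vec{N}]$ and all $i$, the triangle inequality gives

$$d(\varepsilon_1\varepsilon_2(\vec{n} - \vec{e}_i), \varepsilon_1\varepsilon_2(\vec{n})) \leq d(\varepsilon_1(\vec{n} - \vec{e}_i)\varepsilon_2(\vec{n} - \vec{e}_i), \varepsilon_1(\vec{n})\varepsilon_2(\vec{n} - \vec{e}_i))$$

$$\qquad + \; d(\varepsilon_1(\vec{n})\varepsilon_2(\vec{n} - \vec{e}_i), \varepsilon_1(\vec{n})\varepsilon_2(\vec{n})).$$

Using the fact that $d(\varepsilon_1(\vec{n}), \mathrm{id}_G), d(\varepsilon_2(\vec{n}), \mathrm{id}_G) \leq M$ for all $\vec{n} \in [\vec{N}]$, the result now follows
immediately from the right-invariance of $d$, Lemma A.5 and Lemma A.4. □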

We can now state and prove the multiparameter version of Theorem 1.19 that we
need.

Theorem 10.2 (Multiparameter factorization theorem). Let $s, m, t \geq 0$, let $M_0 \geq 2$
and $A > 0$, and let $N_1, \dots, N_t \geq 1$ and $d \geq 0$. Suppose that $G/\Gamma$ is an $m$-dimensional
nilmanifold with an $M_0$-rational Mal'cev basis $X$ adapted to some filtration $G_\bullet$ of degree
$d$, and that $g \in \mathrm{poly}(\mathbb{Z}^t, G_\bullet)$. Then there is some $M$ with $M_0 \leq M \ll M_0^{O_{A,m,d}(1)}$, a
subgroup $G' \subseteq G$ which is $M$-rational with respect to $X$ and a decomposition $g = \varepsilon g' \gamma$
into sequences $\varepsilon, g', \gamma \in \mathrm{poly}(\mathbb{Z}^t, G_\bullet)$ with the following properties:

(i) $\varepsilon$ is $(M, \vec{N})$-smooth;

(ii) $g'$ takes values in $G'$ and, with respect to the restriction of the metric $d$, the orbit
$(g'(\vec{n})\Gamma')_{\vec{n} \in P_1 \times \dots \times P_t}$ is $1/M^A$-equidistributed in $G'/\Gamma'$, for any subprogressions
$P_i \subseteq [N_i]$ with $|P_i| \geq N_i/M^A$;

(iii) $\gamma$ is $M$-rational.

Proof. Let $1/M_0^A = \delta_1 > \delta_2 > \cdots$ be a sequence of parameters to be specified as
the proof unfolds. For each $i = 1, \dots, t$ let $P_i \subseteq [N_i]$ be a progression of size at least
$\delta_1 N_i$. From Proposition 9.2 we know that either $(g(\vec{n})\Gamma)_{\vec{n} \in P_1 \times \dots \times P_t}$ is $\delta_1$-equidistributed
on $G/\Gamma$, or else there is a factorization

$$g = \varepsilon_1 g_1 \gamma_1,$$

where $\varepsilon_1, g_1, \gamma_1 \in \mathrm{poly}(\mathbb{Z}^t, G_\bullet)$, $g_1$ takes values in some $O(\delta_1^{-O(1)})$-rational proper subgroup
$G' \subseteq G$, $\varepsilon_1$ is $(O(\delta_1^{-O(1)}), \vec{N})$-smooth and $\gamma_1$ is $O(\delta_1^{-O(1)})$-rational. Set $\Gamma' := \Gamma \cap G'$;
we are now going to look at the distribution properties of $(g_1(\vec{n}))$ inside $G'/\Gamma'$ by applying
Proposition 9.2 once more.

To do this we choose an $M_0^{O_{A,d,m}(1)}$-rational Mal'cev basis $X'$ for $G'/\Gamma'$ adapted to
the filtration $G'_\bullet := G_\bullet \cap G'$. This is possible by Proposition A.10, and we may furthermore
ensure that each of the basis elements $X_i'$ is an $M_0^{O_{A,d,m}(1)}$-rational combination of the
$X_i$. In view of Lemma A.6 we have

$$d'(x, y) \ll M_0^{O_{A,d,m}(1)} d(x, y) \qquad (10.1)$$

for all $x, y \in G'/\Gamma'$.

Take $\delta_2 := cM_0^{-C}$ for some constants $c, C$ depending on $m$, $d$ and $A$. If these are
chosen suitably, and if $(g_1(\vec{n}))_{\vec{n} \in P_1 \times \dots \times P_t}$ is $\delta_2$-equidistributed on $G'/\Gamma'$ with respect to
the metric $d'$ for all progressions $P_i$ with $|P_i| \geq \delta_2 N_i$, then by (10.1) the conclusion of
the theorem holds. If this is not the case then we apply Proposition 9.2 once again,
obtaining a factorization $g_1 = \varepsilon_2 g_2 \gamma_2$ where $g_2$ takes values in some $O(\delta_2^{-O(1)})$-rational
proper subgroup $G'' \subseteq G'$, $\varepsilon_2 : \mathbb{Z}^t \to G''$ is $(O(\delta_2^{-O(1)}), \vec{N})$-smooth and $\gamma_2 : \mathbb{Z}^t \to G''$ is
$O(\delta_2^{-O(1)})$-rational.

This allows us to write

$$g = \varepsilon_1 \varepsilon_2 g_2 \gamma_2 \gamma_1.$$

Now it follows from Lemma A.6 that $\varepsilon_2 : \mathbb{Z}^t \to G''$ is in fact $(M_0^{O(1)}, \vec{N})$-smooth when
regarded as a map into $G$ (smoothness now being measured with respect to the metric
$d$). By Lemma 10.1, $\varepsilon_1\varepsilon_2 : \mathbb{Z}^t \to G$ is also $(M_0^{O(1)}, \vec{N})$-smooth. By Lemma A.11 (v),
$\gamma_2\gamma_1 : \mathbb{Z}^t \to G$ is $M_0^{O(1)}$-rational. Thus, taking $\varepsilon := \varepsilon_1\varepsilon_2$, $\gamma := \gamma_2\gamma_1$ and $g' := g_2$,
the conclusion of the theorem holds unless $(g_2(\vec{n}))_{\vec{n} \in P_1 \times \dots \times P_t}$ fails to be equidistributed
on $G''/\Gamma''$. We now proceed as before, introducing a Mal'cev basis $X''$ and encoding
this lack of equidistribution as the failure of $(g_2(\vec{n}))_{\vec{n} \in P_1 \times \dots \times P_t}$ to be $\delta_3$-equidistributed
relative to the metric $d'' = d_{X''}$ for some $\delta_3 = cM_0^{-C}$ (the constants $c, C$ are, of course,
not the same as before). We may then apply Proposition 9.2 once more, and so on.

It is clear that the total number of iterations is bounded by $m = \dim G$. The implied
constants in the $O()$ notation increase with each iteration, but since the total number
of iterations is at most $m = O(1)$, this does not cause a difficulty. Thus we obtain a
proof of our main theorem. □

It follows from Lemma A.12 (or rather the multidimensional version of it) that
$(\gamma(\vec{n})\Gamma)_{\vec{n} \in \mathbb{Z}^t}$ is periodic in each direction in the sense that $\gamma(\vec{n} + Q\vec{e}_i)\Gamma = \gamma(\vec{n})\Gamma$ for
some $Q \ll M^{O_{s,m}(1)}$. Setting $t = 1$, we recover Theorem 1.19.

We leave the straightforward deduction of Theorem 1.20 to the reader.

Appendix A. Facts about coordinates and Mal'cev bases 

Let us begin this appendix by discussing coordinate systems on a connected, simply-
connected nilpotent Lie group $G$ of dimension $m$. A discrete and cocompact subgroup
$\Gamma$, leading to a nilmanifold $G/\Gamma$, will be introduced in a little while. Let $\mathfrak{g}$ be the Lie
algebra of $G$, and let $\exp : \mathfrak{g} \to G$ and $\log : G \to \mathfrak{g}$ be the exponential and logarithm
maps, which are both diffeomorphisms. In this appendix all implied constants are
allowed to depend on $m$ and $s$, and for notational brevity this dependence will usually
be suppressed. The rationality parameter $Q$ will always be assumed to be at least 2.

Let us begin by recalling from §5 the notion of coordinates of the first and second
kinds.

Definition A.1 (Coordinates). Let $X = \{X_1, \dots, X_m\}$ be a basis for $\mathfrak{g}$. If

$$g = \exp(t_1 X_1 + \dots + t_m X_m)$$

then we say that $(t_1, \dots, t_m)$ are the coordinates of the first kind or exponential coordinates
for $g$ relative to the basis $X$. We write $(t_1, \dots, t_m) = \psi_{X,\exp}(g)$. If

$$g = \exp(u_1 X_1) \cdots \exp(u_m X_m)$$

then we say that $(u_1, \dots, u_m)$ are the coordinates of the second kind for $g$ relative to
$X$, and we write $(u_1, \dots, u_m) = \psi_X(g)$.

From now on in this appendix (as in the main text) we will write $\psi := \psi_X$ and
$\psi_{\exp} := \psi_{X,\exp}$. When another basis $X'$ for some Lie algebra $\mathfrak{g}'$ is present we shall write
$\psi' := \psi_{X'}$ and $\psi'_{\exp} := \psi_{X',\exp}$.

Recall that $X$ is said to be $Q$-rational if all the structure constants $c_{ijk}$ in the relations

$$[X_i, X_j] = \sum_k c_{ijk} X_k$$

are rationals of height at most $Q$.

The effect of a change of basis is easily understood in coordinates of the first kind 
(indeed, it merely effects a linear transformation of coordinates). Nilmanifolds, however, 
are best studied using coordinates of the second kind. It is, therefore, no surprise that 
the following lemma describing the passage between the two types of coordinate system 
is very useful. 

Lemma A.2 (Coordinates of the first and second type). (i) Let $X$ be a basis for $\mathfrak{g}$ with
the nesting property that

$$[\mathfrak{g}, X_i] \subseteq \mathrm{Span}(X_{i+1}, \dots, X_m) \qquad (A.1)$$

for $i = 1, \dots, m - 1$. Then the compositions $\psi_{\exp} \circ \psi^{-1}$ and $\psi \circ \psi_{\exp}^{-1}$ are both polynomial
maps on $\mathbb{R}^m$ with degree $O(1)$. If $X$ is $Q$-rational then all the coefficients of these
polynomials are rational of height at most $Q^{O(1)}$.

(ii) Suppose that $G' \subseteq G$ is a closed, connected subgroup of dimension $m'$ with associated
Lie algebra $\mathfrak{g}' \subseteq \mathfrak{g}$. Suppose $X'$ is a basis for $\mathfrak{g}'$ with the nesting property. Then
$\psi \circ \psi'^{-1}$ is a polynomial map from $\mathbb{R}^{m'}$ to $\mathbb{R}^m$ and $\psi' \circ \psi^{-1}$ is a polynomial map from
$\psi(G') \subseteq \mathbb{R}^m$ to $\mathbb{R}^{m'}$. Both of these maps have degree $O(1)$. If $X$ and $X'$ are $Q$-rational
and if each element $X_i'$ of $X'$ is a $Q$-rational combination of the $X_i$, then all coefficients
of these polynomials are rationals of height $Q^{O(1)}$.

Proof. (i) Recall the Baker-Campbell-Hausdorff formula, which states that

$$\log(\exp(X)\exp(Y)) = X + Y + \tfrac{1}{2}[X, Y] + \tfrac{1}{12}[X, [X, Y]] - \tfrac{1}{12}[Y, [X, Y]] + \dots,$$

this expression being a sum of $O_s(1)$ terms, each of which is a rational number of height
$O_s(1)$ times a commutator of order at most $s$ involving $X$s and $Y$s. Repeated use of
this allows us to write $\exp(u_1 X_1) \cdots \exp(u_m X_m)$ in the form $\exp(t_1 X_1 + \dots + t_m X_m)$.
Property (A.1) is easily seen to imply that the $t_i$ are polynomials in the $u_i$ with the
specific form

$$t_1 = u_1$$
$$t_2 = u_2 + P_2(u_1)$$
$$t_3 = u_3 + P_3(u_1, u_2)$$
$$\vdots$$
$$t_m = u_m + P_m(u_1, \dots, u_{m-1}). \qquad (A.2)$$

This establishes the claim for $\psi_{\exp} \circ \psi^{-1}$. To prove the result for $\psi \circ \psi_{\exp}^{-1}$ we simply
note that the relations (A.2) are of an "upper triangular" form which is easy to invert.
Thus the $u_i$ are given in terms of the $t_j$ by polynomial relations of a similar upper
triangular form. The quantitative statements follow by the same arguments, keeping
track of the heights of the rational numbers involved. We leave the details to the reader.
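
To see the triangular relations (A.2) in the simplest non-abelian case, consider the Heisenberg group realised as $3 \times 3$ upper unitriangular matrices, with $X_1 = E_{12}$, $X_2 = E_{23}$, $X_3 = E_{13}$, so that $[X_1, X_2] = X_3$. This choice of model, and all function names in the following Python sketch, are assumptions made for illustration. Here (A.2) reads $t_1 = u_1$, $t_2 = u_2$, $t_3 = u_3 + \tfrac{1}{2}u_1 u_2$, and the sketch checks this against a direct matrix computation.

```python
from fractions import Fraction

def mat_mul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def mat_exp_nilpotent(A):
    """exp(A) = I + A + A^2/2 for a strictly upper triangular 3x3 matrix (so A^3 = 0)."""
    n = len(A)
    A2 = mat_mul(A, A)
    return [[Fraction(int(i == j)) + A[i][j] + A2[i][j] / 2 for j in range(n)] for i in range(n)]

def lie_element(c1, c2, c3):
    """c1*X1 + c2*X2 + c3*X3 with X1 = E12, X2 = E23, X3 = E13."""
    z = Fraction(0)
    return [[z, Fraction(c1), Fraction(c3)], [z, z, Fraction(c2)], [z, z, z]]

def from_first_kind(t):
    """Group element exp(t1 X1 + t2 X2 + t3 X3)."""
    return mat_exp_nilpotent(lie_element(*t))

def from_second_kind(u):
    """Group element exp(u1 X1) exp(u2 X2) exp(u3 X3)."""
    g = mat_exp_nilpotent(lie_element(u[0], 0, 0))
    g = mat_mul(g, mat_exp_nilpotent(lie_element(0, u[1], 0)))
    return mat_mul(g, mat_exp_nilpotent(lie_element(0, 0, u[2])))

def second_to_first(u):
    """The map psi_exp o psi^{-1}; in this model P_3(u1, u2) = u1*u2/2."""
    u1, u2, u3 = (Fraction(x) for x in u)
    return (u1, u2, u3 + u1 * u2 / 2)

def first_to_second(t):
    """The inverse map psi o psi_exp^{-1}, obtained by inverting the triangular system."""
    t1, t2, t3 = (Fraction(x) for x in t)
    return (t1, t2, t3 - t1 * t2 / 2)
```

For instance, with $u = (1/2, 3, -2/5)$ the elements `from_second_kind(u)` and `from_first_kind(second_to_first(u))` coincide, and the two coordinate changes are mutually inverse polynomial maps with rational coefficients of bounded height.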

(ii) Note the decomposition

$$\psi \circ \psi'^{-1} = (\psi \circ \psi_{\exp}^{-1}) \circ (\psi_{\exp} \circ \psi'^{-1}_{\exp}) \circ (\psi'_{\exp} \circ \psi'^{-1}).$$

Of the three maps here, the first one is a polynomial map from $\mathbb{R}^m$ to $\mathbb{R}^m$ by (i), and
the third is a polynomial map from $\mathbb{R}^{m'}$ to $\mathbb{R}^{m'}$. The middle map is simply a linear
transformation from $\mathbb{R}^{m'}$ to $\mathbb{R}^m$.

The composition $\psi' \circ \psi^{-1}$ may be dealt with in a very similar manner.

Once again the quantitative claims follow by the same arguments, keeping track of
heights. We leave the details to the reader. □

The upper-triangular form of the relations (A.2) allows us to prove the following key
result, which describes group multiplication and inversion in coordinates.

Lemma A.3 (Multiplication and inversion in coordinates). Let $X$ be a basis for $\mathfrak{g}$ with
the nesting property (A.1). Let $x, y \in G$, and suppose that $\psi(x) = t$ and $\psi(y) = u$.
Then

$$\psi(xy) = (t_1 + u_1, \; t_2 + u_2 + P_1(t_1, u_1), \; \dots, \; t_m + u_m + P_{m-1}(t_1, \dots, t_{m-1}, u_1, \dots, u_{m-1})),$$

where, for each $i = 1, \dots, m - 1$, $P_i : \mathbb{R}^i \times \mathbb{R}^i \to \mathbb{R}$ is a polynomial of degree $O(1)$.
Furthermore

$$\psi(x^{-1}) = (-t_1, \; -t_2 + \tilde{P}_1(t_1), \; \dots, \; -t_m + \tilde{P}_{m-1}(t_1, \dots, t_{m-1})),$$

where $\tilde{P}_i : \mathbb{R}^i \to \mathbb{R}$ is a polynomial of degree $O(1)$. Let $Q \geq 2$. If $X$ is $Q$-rational then
all the coefficients of the polynomials $P_i$, $\tilde{P}_i$ are rationals of height $Q^{O(1)}$.



Proof. By (A.2) we know that

$$\psi_{\exp}(x) = (t_1, \; t_2 + R_1(t_1), \; \dots, \; t_m + R_{m-1}(t_1, \dots, t_{m-1}))$$

and similarly for $\psi_{\exp}(y)$, where $R_i : \mathbb{R}^i \to \mathbb{R}$ is a polynomial for $i = 1, \dots, m - 1$.
It follows from the Baker-Campbell-Hausdorff formula and the nesting property (A.1)
that

$$\psi_{\exp}(xy) = (t_1 + u_1, \; t_2 + u_2 + S_1(t_1, u_1), \; \dots, \; t_m + u_m + S_{m-1}(t_1, \dots, t_{m-1}, u_1, \dots, u_{m-1})),$$

where each $S_i : \mathbb{R}^i \times \mathbb{R}^i \to \mathbb{R}$ is again polynomial. The statement about the form of
$\psi(xy)$ now follows from a further application of the relations (A.2), and the statement
about $\psi(x^{-1})$ is an immediate corollary of it.

To obtain the quantitative versions of these statements we use the same arguments,
keeping track of the heights of the rational numbers involved. We leave the details to
the reader. □
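
As a concrete instance of Lemma A.3, take again the Heisenberg group with $X_1 = E_{12}$, $X_2 = E_{23}$, $X_3 = E_{13}$ (an illustrative assumption, as are the function names below). In coordinates of the second kind one finds $\psi(xy) = (t_1 + u_1, t_2 + u_2, t_3 + u_3 - t_2 u_1)$ and $\psi(x^{-1}) = (-t_1, -t_2, -t_3 - t_1 t_2)$, so that $P_2 = -t_2 u_1$ and $\tilde{P}_2 = -t_1 t_2$. The Python sketch below verifies these formulas against matrix multiplication.

```python
from fractions import Fraction

def mul(t, u):
    """psi(xy) when psi(x) = t and psi(y) = u (Heisenberg group, second-kind coordinates)."""
    return (t[0] + u[0], t[1] + u[1], t[2] + u[2] - t[1] * u[0])

def inv(t):
    """psi(x^{-1}) when psi(x) = t."""
    return (-t[0], -t[1], -t[2] - t[0] * t[1])

def to_matrix(t):
    """exp(t1 X1) exp(t2 X2) exp(t3 X3) as an upper unitriangular matrix."""
    return [[1, t[0], t[2] + t[0] * t[1]], [0, 1, t[1]], [0, 0, 1]]

def mat_mul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

x = (Fraction(1, 2), Fraction(-3), Fraction(2, 5))
y = (Fraction(4), Fraction(1, 3), Fraction(-1))
z = (Fraction(-7, 3), Fraction(5, 2), Fraction(1, 9))
```

The correction term in the last coordinate depends only on earlier coordinates, which is exactly the upper-triangular structure exploited throughout this appendix.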



Recall at this point Definition 2.2, in which a basis $X$ is used to define a metric $d = d_X$
on $G$. We defined $d$ to be the largest metric such that $d(x, y) \leq |\psi(xy^{-1})|$ for all
$x, y \in G$, where $|\cdot|$ denotes the $\ell^\infty$-norm on $\mathbb{R}^m$. For practical purposes it is important
to have an understanding of such metrics in terms of the coordinates $\psi(x)$ and $\psi(y)$, or
even in terms of coordinates $\psi'(x)$, $\psi'(y)$ relative to some other basis $X'$. The following
lemma provides some information in this regard. Here, and in the rest of this appendix,
we write $d := d_X$ and $d' := d_{X'}$.

Lemma A.4 (Bounds for $d$ in terms of coordinates). Suppose that $Q \geq 2$. Suppose
that $X, X'$ are two $Q$-rational bases for $\mathfrak{g}$, both satisfying the nesting condition (A.1).
Suppose that each $X_i'$ is given by a $Q$-rational combination of the $X_i$ and vice versa.
Then for all $x, y \in G$ with $|\psi'(x)|, |\psi'(y)| \leq Q$ we have the bound

$$d(x, y) \ll Q^{O(1)} |\psi'(x) - \psi'(y)|, \qquad (A.3)$$

and for all $x, y \in G$ with $d(x, \mathrm{id}_G), d(y, \mathrm{id}_G) \leq Q$ we have the bound

$$|\psi'(x) - \psi'(y)| \ll Q^{O(1)} d(x, y). \qquad (A.4)$$

Proof. Inequality (A.3) is by far the easier of the two inequalities claimed here and we
prove it first. By definition we have $d(x, y) \leq |\psi(xy^{-1})|$. Write $\psi'(x) = t$ and $\psi'(y) = u$;
by Lemmas A.2 and A.3 we see that the coordinates $\psi(xy^{-1})$ are

$$(P_1(t, u), \dots, P_m(t, u)),$$

where each $P_j : \mathbb{R}^m \times \mathbb{R}^m \to \mathbb{R}$ is a polynomial of degree $O(1)$ whose coefficients are
rationals of height $Q^{O(1)}$. Each of these polynomials of course vanishes when $t = u$, and
so we can write (e.g.)

$$P_1(t, u) = P_1(t, u) - P_1(t, t) = \sum_{i=1}^{m} (t_i - u_i) R_{i,1}(t, u),$$

where each $R_{i,1} : \mathbb{R}^m \times \mathbb{R}^m \to \mathbb{R}$ is a polynomial of degree $O(1)$ whose coefficients are
rationals of height $Q^{O(1)}$. (One way to see this is to expand $P_1$ as a sum of monomials
$t^\alpha u^\beta$.) The bound (A.3) follows immediately.

The second bound, (A.4), is significantly more difficult. We begin by proving the
special case in which $X = X'$ and $y = \mathrm{id}_G$, or in other words the following claim:

$$|\psi(x)| \ll Q^{O(1)} d(x, \mathrm{id}_G) \text{ uniformly for all } x \text{ with } d(x, \mathrm{id}_G) \leq Q. \qquad (A.5)$$

Write $\kappa(x, y) := \min(|\psi(xy^{-1})|, |\psi(yx^{-1})|)$. We will use the bound

$$|\psi(x) - \psi(y)| \ll Q^{O(1)} \kappa(x, y)(1 + \kappa(x, y) + |\psi(y)|)^{O(1)}. \qquad (A.6)$$

To prove this when $\kappa(x, y) = |\psi(xy^{-1})|$ we proceed much as in the proof of (A.11): set
$x = zy$ and use Lemma A.3 to expand $\psi(x) - \psi(y) = \psi(zy) - \psi(y)$ as a polynomial
in the coordinates of $v = \psi(y)$ and $w = \psi(z)$ which vanishes when $w = 0$. When
$\kappa(x, y) = |\psi(yx^{-1})|$ we proceed similarly, setting $x = z^{-1}y$.

From (A.6) we see in particular that if $|\psi(y)| \leq 1$ and $\kappa(x, y) \leq 1$, then

$$|\psi(x)| \leq |\psi(y)| + CQ^C \kappa(x, y)$$

for some constant $C \geq 1$. Iterating this we see that if $x_0, \dots, x_n$ are elements of $G$ with
$x_0 = \mathrm{id}_G$ and $\kappa(x_0, x_1) + \dots + \kappa(x_{n-1}, x_n) \leq C^{-1}Q^{-C}$ then

$$|\psi(x_n)| \leq CQ^C (\kappa(x_0, x_1) + \dots + \kappa(x_{n-1}, x_n)).$$

Inspecting the definition of $d$, we conclude that

$$|\psi(x)| \ll Q^{O(1)} d(x, \mathrm{id}_G) \text{ whenever } d(x, \mathrm{id}_G) \leq C^{-1}Q^{-C}. \qquad (A.7)$$

By right-invariance and symmetry of $d$, we can amplify this to

$$\kappa(x, y) \ll Q^{O(1)} d(x, y) \text{ whenever } d(x, y) \leq C^{-1}Q^{-C}. \qquad (A.8)$$

The estimate (A.7) is almost what we need, except that the bound on $d(x, \mathrm{id}_G)$ is too
strict. To relax it, we argue as follows. To obtain (A.5), it suffices to show that

$$|\psi(x_n)| \ll Q^{O(1)} (\kappa(x_0, x_1) + \dots + \kappa(x_{n-1}, x_n))$$

whenever $x_0, \dots, x_n \in G$ with $x_0 = \mathrm{id}_G$ and $\kappa(x_0, x_1) + \dots + \kappa(x_{n-1}, x_n) \leq 2Q$ (say).

Using a greedy algorithm, split the path $(x_0, \dots, x_n)$ into $O(Q^{O(1)})$ paths $(x_i, \dots, x_j)$
with $\kappa(x_i, x_{i+1}) + \dots + \kappa(x_{j-1}, x_j) \leq C^{-1}Q^{-C}$, plus $O(Q^{O(1)})$ singleton paths $(x_i, x_{i+1})$
with $C^{-1}Q^{-C} \leq \kappa(x_i, x_{i+1}) \leq 2Q$. Applying (A.8), we thus see that there exists a path
$(y_0, \dots, y_r)$ with $r = O(Q^{O(1)})$, $y_0 = \mathrm{id}_G$, and $y_r = x_n$, such that $\kappa(y_i, y_{i-1}) \ll Q^{O(1)}$ for
all $1 \leq i \leq r$. In particular (using Lemma A.3) if we write $g_i := y_i y_{i-1}^{-1}$ for $1 \leq i \leq r$,
then we see that $|\psi(g_i)| \ll Q^{O(1)}$. On the other hand, we have the telescoping product

$$x_n = g_r \cdots g_1.$$

Now if $g_1, \dots, g_r \in G$ are any elements with $|\psi(g_i)| \leq t$ for all $i$ then

$$|\psi(g_1 \cdots g_r)| \ll (1 + t)^{O(1)} r^{O(1)}.$$

This may be seen by applying Lemma A.3 repeatedly to expand the product out completely
in coordinates. That the first coordinate is polynomially controlled is obvious,
and it then follows that the second is also, and so on inductively. Applying this in the
present situation gives $|\psi(x_n)| \ll Q^{O(1)}$, and similar arguments for each $i$ give that in
fact $|\psi(x_i)| \ll Q^{O(1)}$ uniformly for $0 \leq i \leq n$. Applying (A.6) we have

$$|\psi(x_i)| \leq |\psi(x_{i-1})| + O(Q^{O(1)} \kappa(x_{i-1}, x_i))$$

and (A.5) follows.

We have just established the special case $X = X'$, $y = \mathrm{id}_G$ of (A.4). We now deal with
the case where $X = X'$ but $y$ is arbitrary. Suppose then that $d(x, \mathrm{id}_G), d(y, \mathrm{id}_G) \leq Q$.
Applying (A.5) we see that $|\psi(x)|, |\psi(y)| \ll Q^{O(1)}$. By Lemma A.3 we therefore have
$|\psi(xy^{-1})| \ll Q^{O(1)}$, and hence by (A.3) it follows that $d(xy^{-1}, \mathrm{id}_G) \ll Q^{O(1)}$. Applying
(A.5) once more, we see that

$$|\psi(xy^{-1})| \ll Q^{O(1)} d(xy^{-1}, \mathrm{id}_G),$$

which, since $d$ is right-invariant, implies that

$$|\psi(xy^{-1})| \ll Q^{O(1)} d(x, y). \qquad (A.9)$$

The claimed result now follows immediately using (A.6).

Finally we turn to the general case in which $X$ and $X'$ may be different. We start
with the special case of (A.4) just proved, namely

$$|\psi(x) - \psi(y)| \ll Q^{O(1)} d(x, y). \qquad (A.10)$$

Applying (A.3) we obtain

$$d'(x, y) \ll Q^{O(1)} |\psi(x) - \psi(y)| \ll Q^{O(1)} d(x, y).$$

In particular we have $d'(x, \mathrm{id}_G), d'(y, \mathrm{id}_G) \ll Q^{O(1)}$. A second application of (A.10),
with $X$ replaced by $X'$, then gives

$$|\psi'(x) - \psi'(y)| \ll Q^{O(1)} d'(x, y) \ll Q^{O(1)} d(x, y).$$

This concludes the proof of Lemma A.4. □

The metric $d$ is right-invariant, that is to say $d(xg, yg) = d(x, y)$ for all $x, y, g \in G$.
It is useful to have, in addition, the following approximate left-invariance property.

Lemma A.5 (Approximate left-invariance of $d$). Suppose that $Q \geq 2$ and that $X$ is a
$Q$-rational basis for $\mathfrak{g}$ satisfying the nesting condition (A.1). Suppose that $g, x, y \in G$
are elements with $|\psi(x)|, |\psi(y)|, |\psi(g)| \leq Q$. Then we have the bound

$$d(gx, gy) \ll Q^{O(1)} d(x, y).$$

Proof. We start by observing that uniformly in $g, z \in G$ we have the bound

$$|\psi(gzg^{-1})| \ll Q^{O(1)} (1 + |\psi(z)| + |\psi(g)|)^{O(1)} |\psi(z)|. \qquad (A.11)$$

This follows by using Lemma A.3 to conclude that the components of $\psi(gzg^{-1})$ are
polynomials of degree $O(1)$ with $Q^{O(1)}$-rational coefficients in the coordinates $v = \psi(g)$
and $w = \psi(z)$, and these polynomials all vanish when $w = 0$. Recall from Definition
2.2 that

$$d(x, y) = \inf\Big\{ \sum_{i=1}^{n} \min(|\psi(x_{i-1}x_i^{-1})|, |\psi(x_i x_{i-1}^{-1})|) : x_0, \dots, x_n \in G; \; x_0 = x; \; x_n = y \Big\}. \qquad (A.12)$$

We see, then, that the lemma will follow from (A.11) (taking $z = x_i x_{i-1}^{-1}$ or $x_{i-1}x_i^{-1}$) if
we can show that the infimum may be taken over all those $x_i, x_{i-1}$ which satisfy some
bound $\min(|\psi(x_{i-1}x_i^{-1})|, |\psi(x_i x_{i-1}^{-1})|) \ll Q^{O(1)}$. But this follows from the inequality
$d(x, y) \ll Q^{O(1)}$, which is an instant consequence of Lemma A.4. □
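
The conjugation bound (A.11) can be checked by hand in the Heisenberg group with $X_1 = E_{12}$, $X_2 = E_{23}$, $X_3 = E_{13}$; this model and the function names below are illustrative assumptions. With the multiplication law $\psi(xy) = (t_1 + u_1, t_2 + u_2, t_3 + u_3 - t_2 u_1)$ one computes $\psi(gzg^{-1}) = (z_1, z_2, z_3 + g_1 z_2 - g_2 z_1)$, which vanishes when $\psi(z) = 0$ and satisfies $|\psi(gzg^{-1})| \leq (1 + 2|\psi(g)|)|\psi(z)|$. The Python sketch below confirms this on a sample.

```python
from fractions import Fraction

def mul(t, u):
    """psi(xy) in second-kind coordinates on the Heisenberg group."""
    return (t[0] + u[0], t[1] + u[1], t[2] + u[2] - t[1] * u[0])

def inv(t):
    """psi(x^{-1}) in second-kind coordinates."""
    return (-t[0], -t[1], -t[2] - t[0] * t[1])

def conj(g, z):
    """psi(g z g^{-1}), computed from the group law."""
    return mul(mul(g, z), inv(g))

def conj_closed_form(g, z):
    """The same quantity from the explicit polynomial formula."""
    return (z[0], z[1], z[2] + g[0] * z[1] - g[1] * z[0])

def sup_norm(t):
    """ell^infinity norm of a coordinate vector."""
    return max(abs(c) for c in t)

g = (Fraction(3, 2), Fraction(-2), Fraction(5, 7))
z = (Fraction(1, 3), Fraction(1, 4), Fraction(-1, 5))
```

Here `conj(g, z)` equals $(1/3, 1/4, 101/120)$, comfortably within the bound $(1 + 2 \cdot 2) \cdot \tfrac{1}{3}$.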

We conclude this subsection by recording the following result. 

Lemma A.6 (Comparison lemma). Suppose that $G' \subseteq G$ is a closed subgroup and that
$X, X'$ are bases for $\mathfrak{g}, \mathfrak{g}'$ respectively which have the nesting property (A.1). Let $Q \geq 2$,
and suppose that each $X_i'$ is a $Q$-rational combination of the $X_i$. Then we have the
bounds

$$d'(x, y) \ll Q^{O(1)} d(x, y)$$

uniformly for all $x, y \in G'$ with $|\psi(x)|, |\psi(y)| \leq Q$ and

$$d(x, y) \ll Q^{O(1)} d'(x, y)$$

uniformly for all $x, y \in G'$ with $|\psi'(x)|, |\psi'(y)| \leq Q$.

Proof. We follow essentially the same argument used in the previous lemma. To
prove the first bound, for example, replace (A.11) with the bound

$$|\psi'(z)| \ll Q^{O(1)} (1 + |\psi(z)|)^{O(1)} |\psi(z)|.$$

This follows immediately from Lemma A.2 (ii), which guarantees that $\psi'(z)$ is a polynomial
in the coordinates $\psi(z)$ which vanishes when $\psi(z) = 0$. □

Mal'cev bases. Suppose that $G$ is a connected, simply-connected nilpotent Lie
group with a filtration $G_\bullet$. Let us now introduce a discrete and cocompact subgroup
$\Gamma$ to the discussion. Throughout the paper we have assumed that $G/\Gamma$ comes together
with a special type of basis $X$ called a Mal'cev basis adapted to $G_\bullet$, which is invoked
whenever it is necessary to discuss the metric structure of $G/\Gamma$.

Let us recall from §2 the basic properties of these bases:

(i) For each $j = 0, \dots, m - 1$ the subspace $\mathfrak{h}_j := \mathrm{Span}(X_{j+1}, \dots, X_m)$ is a Lie
algebra ideal in $\mathfrak{g}$, and hence $H_j := \exp \mathfrak{h}_j$ is a normal Lie subgroup of $G$.

(ii) For every $i$, $0 \leq i \leq s$, we have $G_i = H_{m - \dim(G_i)}$ (or equivalently, $\mathfrak{g}_i =
\mathfrak{h}_{m - \dim(\mathfrak{g}_i)}$).

(iii) Each $g \in G$ can be written uniquely as $\exp(t_1 X_1) \cdots \exp(t_m X_m)$, for $t_1, \dots, t_m \in
\mathbb{R}$.

(iv) $\Gamma$ consists precisely of those elements which, when written in the above form,
have all $t_1, \dots, t_m \in \mathbb{Z}$.

Mal'cev bases are not especially flexible in certain ways - for example it is not at all
easy to take a Mal'cev basis on $G/\Gamma$ and use it to construct one on $G^\square/\Gamma^\square$ as we had to
do in the proof of Lemma 7.4. For additional flexibility it is convenient to introduce the
notion of a weak basis for $G/\Gamma$. These are only ever used in the process of constructing
actual Mal'cev bases with desirable properties.

Definition A.7 (Weak bases). Let $X = \{X_1, \dots, X_m\}$ be a basis for $\mathfrak{g}$. Let $Q \geq 2$ be
a parameter. We say that $X$ is a $Q$-rational weak basis for $G/\Gamma$ if $X$ is $Q$-rational (cf.
Definition 2.4) and if we have $\frac{1}{q}\mathbb{Z}^m \supseteq \psi_{\exp}(\Gamma) \supseteq q\mathbb{Z}^m$ for some $q \leq Q$, that is to say the
coordinates of $\log \Gamma$ relative to $X$ are close to being integers.

Note carefully that $\log \Gamma$ is not necessarily a subgroup of $\mathfrak{g}$, as we saw in §5 in
connection with the Heisenberg example.

We record some simple facts about weak bases. 

Lemma A.8 (Weak bases: simple facts). Weak bases enjoy the following properties.

(i) Suppose that $X$ is a $Q$-rational weak basis for $G/\Gamma$, and that $X' = \{X_1', \dots, X_m'\}$
is another basis for $\mathfrak{g}$ with the property that each $X_i'$ is a $Q$-rational combination
of the $X_i$. Then $X'$ is a $Q^{O(1)}$-rational weak basis for $G/\Gamma$.

(ii) Suppose that $X$ is a Mal'cev basis adapted to some subgroup sequence $G_\bullet$, that
is to say conditions (i), (ii), (iii) and (iv) from the start of the section are
satisfied. Suppose that $X$ is $Q$-rational. Then $X$ is an $O(Q^{O(1)})$-rational weak
basis for $G/\Gamma$.

Proof. Part (i) is immediate. Part (ii) follows quickly from Lemma A.2. □

The next proposition allows us to construct Mal'cev bases from weak bases. If $X$ is a
Mal'cev basis for $G/\Gamma$ and if $G' \subseteq G$ is a subgroup, we say that $G'$ is $Q$-rational if the
Lie algebra $\mathfrak{g}'$ is generated by $Q$-rational combinations of the basis elements $X_i$.

Proposition A.9 (Construction of Mal'cev bases). Suppose that $X$ is a $Q$-rational
weak basis for $G/\Gamma$ and that $G_\bullet$ is a filtration in which each subgroup $G_i$ is $Q$-rational.
Then there is a Mal'cev basis $X' = \{X_1', \dots, X_m'\}$ for $G/\Gamma$ adapted to $G_\bullet$ in which each
$X_i'$ is a $Q^{O(1)}$-rational combination of the basis elements $X_i$. In particular, the Mal'cev
basis $X'$ is $Q^{O(1)}$-rational.

Proof. Take a basis for $\mathfrak{g}_d$ consisting of $Q$-rational linear combinations of the $X_i$. By
straightforward linear algebra this may be extended to a basis of $\mathfrak{g}_{d-1}$ consisting of $Q^{O(1)}$-
rational combinations of the $X_i$. This in turn may be extended to a basis of $\mathfrak{g}_{d-2}$ and so
on. In this fashion we obtain a basis $\mathcal{Y} = \{Y_1, \dots, Y_m\}$ for $\mathfrak{g}$ as a vector space consisting
of $Q^{O(1)}$-rational combinations of the $X_i$ such that each $\mathfrak{g}_i$ equals $\mathrm{Span}(Y_{j+1}, \dots, Y_m)$
where $j = m - m_i$. By Lemma A.8 (i) we see that $\mathcal{Y}$ is a $Q^{O(1)}$-rational weak basis for
$G/\Gamma$.

Since $[\mathfrak{g}, \mathfrak{g}_i] \subseteq \mathfrak{g}_{i+1}$ for all $i$ we see that the weak basis $\mathcal{Y}$ enjoys the nesting property,
that is to say $[\mathfrak{g}, Y_j] \subseteq \mathrm{Span}(Y_{j+1}, \dots, Y_m)$ for all $j$.

We now convert this basis $\mathcal{Y}$ into the desired Mal'cev basis by choosing $X_m' =
c_m Y_m, \dots, X_1' = c_1 Y_1$ in turn so that

$$\exp(\mathrm{Span}(Y_{i+1}, \dots, Y_m)) \cap \Gamma = \{\exp(n_{i+1} X_{i+1}') \cdots \exp(n_m X_m') : n_{i+1}, \dots, n_m \in \mathbb{Z}\} \qquad (A.13)$$

for $i = m - 1, \dots, 0$. Such a basis $X'$ has all of the properties (i), (ii), (iii) and (iv)
required to qualify as a Mal'cev basis. Suppose this is done for $i = j$. Since $\mathcal{Y}$ is a
$Q^{O(1)}$-rational weak basis for $G/\Gamma$ we see that

$$(\exp(\mathrm{Span}(Y_j, \dots, Y_m)) \cap \Gamma) / \exp(\mathrm{Span}(Y_{j+1}, \dots, Y_m))$$

is generated by $\exp(c_j Y_j)$ for some $c_j \in \mathbb{Q}$ with heights bounded by $Q^{O(1)}$. Taking
$X_j' := c_j Y_j$, we see that (A.13) holds for $i = j - 1$ too. □

For applications (for example in the proof of Lemma 7.4) it is convenient to have the
following variant of the above proposition.

Proposition A.10 (Mal'cev bases of subnilmanifolds). Suppose that $X = \{X_1, \dots, X_m\}$
is a $Q$-rational Mal'cev basis for $G/\Gamma$ adapted to a filtration $G_\bullet$. Suppose that $G' \subseteq G$
is a $Q$-rational subgroup of $G$, and furthermore that $G'_\bullet$ is a filtration on $G'$ in which
each of the groups $G_i'$ is $Q$-rational (with respect to the basis $X$). Write $\Gamma' := \Gamma \cap G'$.
Then $G'/\Gamma'$ has a Mal'cev basis $X' = \{X_1', \dots, X_{m'}'\}$ adapted to $G'_\bullet$ in which each $X_i'$ is
a $Q^{O(1)}$-rational combination of the $X_i$.

Proof. One simply observes that by linear algebra there is a basis $\mathcal{Y}' = \{Y_1, \dots, Y_{m'}\}$
for $\mathfrak{g}'$ together with an extension $\mathcal{Y} = \{Y_1, \dots, Y_m\}$ to a basis for $\mathfrak{g}$ such that each of the
$Y_i$ is a $Q^{O(1)}$-rational combination of the $X_i$. By Lemma A.8, $\mathcal{Y}$ is a weak basis for $G/\Gamma$,
and therefore $\mathcal{Y}'$ is a weak basis for $G'/\Gamma'$. The result now follows from Proposition A.9
applied to this weak basis. □

Rationality. We now record some simple results about rational points in nilmanifolds
$G/\Gamma$. Recall Definition 1.11: $g \in G$ is rational if $g^r \in \Gamma$ for some integer $r > 0$.
Recall also the quantitative version of this, Definition 1.17: $g \in G$ is $Q$-rational if $g^r \in \Gamma$
for some integer $r$, $0 < r \leq Q$.

Lemma A.11 (Properties of rational points). Suppose that $X$ is a $Q$-rational Mal'cev
basis adapted to some subgroup sequence $G_\bullet$, where $Q \geq 2$.

(i) If $\gamma \in G$, then $\gamma$ is rational if and only if $\psi(\gamma) \in \mathbb{Q}^m$.

(ii) The set of rational points in $G$ is a group.

(iii) If $\gamma \in G$ is $Q$-rational, then $\psi(\gamma) \in \frac{1}{Q'}\mathbb{Z}^m$ for some $Q'$, $1 \leq Q' \ll Q^{O(1)}$, which
does not depend on $\gamma$.

(iv) If $\gamma \in G$ is such that $\psi(\gamma) \in \frac{1}{Q}\mathbb{Z}^m$, then $\gamma$ is $O(Q^{O(1)})$-rational.

(v) If $\gamma, \gamma'$ are $Q$-rational, then $\gamma\gamma'$ and $\gamma^{-1}$ are $O(Q^{O(1)})$-rational.

Proof. If $\gamma$ is rational, then by definition there exists $r \geq 1$ such that $\gamma^r \in \Gamma$, and
thus $\psi(\gamma^n) \in \mathbb{Z}^m$ whenever $n$ is a multiple of $r$. Now from Lemma 6.7 we know that
the coordinates $\psi(\gamma^n)$ are all polynomials of degree $O(1)$; these vanish at zero, and take
integer values at multiples of $r$. By the Lagrange interpolation formula we conclude
that all the coefficients of these polynomials are rational, and so in particular we have
$\psi(\gamma) \in \mathbb{Q}^m$.

Suppose conversely that $\psi(\gamma) \in \mathbb{Q}^m$. Then by Lemma A.3 we see that each of $\psi(\gamma^2)$,
$\psi(\gamma^3)$, ... also lies in $\mathbb{Q}^m$. By another application of Lemma 6.7 and the Lagrange
interpolation formula we conclude that each coordinate of $\psi(\gamma^n)$ is a polynomial with
rational coefficients which vanishes at zero. In particular it is easy to see that by
choosing $r \in \mathbb{N}$ suitably we may ensure that $\psi(\gamma^r) \in \mathbb{Z}^m$, which of course implies that
$\gamma^r \in \Gamma$.

Part (ii) follows immediately from (i) and Lemma A.3.

Claims (iii)-(v) follow by repeating the above arguments, but keeping track of the
heights of all the rational numbers involved; the key point is that the group operations,
as well as Lagrange interpolation, are all polynomial in nature and so all heights will
be $O(Q^{O(1)})$. We omit the routine details. □
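
A small worked example may help with part (i). In the Heisenberg group with $X_1 = E_{12}$, $X_2 = E_{23}$, $X_3 = E_{13}$ and $\Gamma$ the elements with integer coordinates (an illustrative model, with hypothetical function names), take $\psi(\gamma) = (1/2, 1/2, 0)$. Then $\psi(\gamma^n) = (n/2, n/2, -n(n-1)/8)$ is polynomial in $n$, and the least $r \geq 1$ with $\gamma^r \in \Gamma$ is $r = 8$, which is larger than the denominators of $\psi(\gamma)$ but still polynomially bounded in them. The Python sketch below finds $r$ by direct search.

```python
from fractions import Fraction

def mul(t, u):
    """psi(xy) in second-kind coordinates on the Heisenberg group."""
    return (t[0] + u[0], t[1] + u[1], t[2] + u[2] - t[1] * u[0])

def power(gamma, n):
    """psi(gamma^n) for n >= 0, by repeated multiplication."""
    result = (Fraction(0), Fraction(0), Fraction(0))
    for _ in range(n):
        result = mul(result, gamma)
    return result

def order_mod_lattice(gamma, max_r):
    """Smallest r in 1..max_r such that gamma^r has integer coordinates, or None."""
    for r in range(1, max_r + 1):
        if all(c.denominator == 1 for c in power(gamma, r)):
            return r
    return None

gamma = (Fraction(1, 2), Fraction(1, 2), Fraction(0))
```

The search bound `max_r` is needed only because the sketch does not compute the a priori bound on $r$; Lemma A.11 (iv) guarantees that some such bound of size $O(Q^{O(1)})$ exists.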

Let us now recall the notion of a rational sequence, also given in Definition 1.11. A
sequence $\gamma : \mathbb{Z} \to G$ is rational if $\gamma(n)\Gamma$ is rational for all $n$, and it is $Q$-rational if
$\gamma(n)\Gamma$ is $Q$-rational for all $n$. The next lemma records some useful properties of rational
polynomial sequences.

Lemma A.12 (Properties of rational polynomial sequences). Suppose that $\gamma : \mathbb{Z} \to G$
is a polynomial sequence of degree $d$.

(i) Suppose that $\gamma$ is rational. Then $\gamma(n)\Gamma$ is periodic.

(ii) Suppose that there is a $Q$-rational Mal'cev basis $X$ for $G/\Gamma$ and that $\gamma$ is $Q$-
rational. Then $\gamma(n)\Gamma$ is periodic with period $\ll Q^{O(1)}$.

Proof. (i) Let $X$ be any Mal'cev basis for $G/\Gamma$. By Lemma 6.7 the coordinates
$\psi(\gamma(n))$ are all polynomials of degree $O(1)$, and by the previous lemma and the Lagrange
interpolation formula they all have rational coefficients. Clearing denominators, we thus
find some $q$ such that $\psi(\gamma(n)) \in \frac{1}{q}\mathbb{Z}^m$ for all integers $n$. By Lemma A.3 we see that
there is some $q' \in \mathbb{N}$ such that, for any $r \in \mathbb{Z}$, we have $\psi(\gamma(n + r)\gamma(n)^{-1}) \in \frac{1}{q'}\mathbb{Z}^m$.
Thus $\gamma(n)\Gamma$ is indeed periodic, with period $qq'$.

Part (ii) is proved in exactly the same way, once again taking care to keep track of
the heights of all rationals involved. □
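
Periodicity can be observed directly in a toy case. In the Heisenberg group with $X_1 = E_{12}$, $X_2 = E_{23}$, $X_3 = E_{13}$ and $\Gamma$ the integer points (an illustrative model), consider the hypothetical sequence with $\psi(\gamma(n)) = (n/2, n/2, 0)$. Two points of $G$ lie in the same coset of $\Gamma$ exactly when their reductions to the fundamental domain $[0,1)^3$ agree, so the Python sketch below compares those reductions. It finds that $\gamma(n)\Gamma$ has period $4$, not $2$, because of the non-abelian correction term.

```python
from fractions import Fraction
from math import floor

def mul(t, u):
    """psi(xy) in second-kind coordinates on the Heisenberg group."""
    return (t[0] + u[0], t[1] + u[1], t[2] + u[2] - t[1] * u[0])

def fractional_part(g):
    """Coordinates of the representative of g*Gamma in [0,1)^3.

    Right-multiply by a lattice element (n1, n2, n3) chosen so that
    coordinates 1, 2, 3 are brought into [0, 1) in turn.
    """
    n1 = -floor(g[0])
    n2 = -floor(g[1])
    n3 = -floor(g[2] - g[1] * n1)  # third coordinate after the first two corrections
    return mul(g, (n1, n2, n3))

def gamma(n):
    """Hypothetical rational polynomial sequence with psi(gamma(n)) = (n/2, n/2, 0)."""
    return (Fraction(n, 2), Fraction(n, 2), Fraction(0))

def is_period(q, span=40):
    """Check gamma(n + q)*Gamma == gamma(n)*Gamma for all n in [-span, span)."""
    return all(fractional_part(gamma(n + q)) == fractional_part(gamma(n))
               for n in range(-span, span))
```

For odd $n = 2k + 1$ the reduced third coordinate is the fractional part of $k/2$, which is why the coset depends on $n$ modulo $4$.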

We leave the formulation and proof of the multidimensional version of this lemma
(that is, concerning maps $\gamma : \mathbb{Z}^t \to G$) to the reader; only trivial modifications are
required.

The next result, stating that conjugates of rational subgroups by rational elements
are rational, is not needed in the present paper. It is required in the companion paper
[13].

Lemma A.13 (Rational conjugates). Suppose that $X = \{X_1, \dots, X_m\}$ is a $Q$-rational
Mal'cev basis for $G/\Gamma$ adapted to some filtration. Suppose that $\gamma \in G$ is $Q$-rational and
additionally that the coordinates $\psi(\gamma)$ are all bounded in magnitude by $Q$. Suppose that
$G' \subseteq G$ is a $Q$-rational subgroup. Then the conjugate $\gamma G' \gamma^{-1}$ is $Q^{O(1)}$-rational.

Proof. Set $H := \gamma G' \gamma^{-1}$ and let $\mathfrak{h}$ be the corresponding Lie algebra. Recall from
basic Lie theory the identity

$$\log(\gamma \exp(X) \gamma^{-1}) = \mathrm{Ad}(\gamma)X,$$

where $\mathrm{Ad}(\gamma) : \mathfrak{g} \to \mathfrak{g}$ is the adjoint automorphism of $\mathfrak{g}$ associated to the element $\gamma \in G$.
For the purposes of this argument all we need is the following immediate consequence
of this identity: if $X_1', \dots, X_{m'}'$ is a basis for the Lie algebra $\mathfrak{g}'$ then the elements

$$Y_j := \log(\gamma \exp(X_j') \gamma^{-1})$$

are a basis for $\mathfrak{h}$. By assumption we may choose the $X_j'$ to be $Q$-rational combinations
of the $X_i$. It then follows from Lemmas A.2 and A.3 that each $Y_j$ is a $Q^{O(1)}$-rational
combination of the $X_i$. □

Fundamental domain and reduction. The next lemma provides a description
of G/Γ in terms of coordinates relative to any Mal'cev basis X.

Lemma A.14 (Reducing to the fundamental domain). Let X be a Mal'cev basis adapted
to some subgroup sequence G_•. Suppose that g ∈ G. Then we may write g = {g}[g] in
a unique way, where ψ({g}) ∈ [0, 1)^m and [g] ∈ Γ.



Proof. Recall Lemma A.3, which describes the multiplication on G in coordinates
relative to X. Using this we may iteratively construct γ_m, γ_{m-1}, ..., γ_1 ∈ Γ in such a
way that coordinates i + 1, ..., m of ψ(gγ_m ⋯ γ_i) all lie in the interval [0, 1).

The uniqueness also follows easily from Lemma A.3: if ψ(xγ), ψ(x) ∈ [0, 1)^m then we
may equate coefficients of ψ(γ) starting at the right to deduce that γ = id_G. □

Metrics on nilmanifolds. Let X be a Mal'cev basis for some nilmanifold G/Γ.
Recall from Definition 2.2 the manner in which we used the metric d = d_X on G to
define a "metric" on G/Γ via

d(xΓ, yΓ) = inf_{γ,γ' ∈ Γ} d(xγ, yγ').

We can now prove that d really is a metric on G/Γ (and thus the inverted commas
above can be dispensed with).

Lemma A.15 (Nondegeneracy of metric). Suppose that X is a rational Mal'cev basis
for a nilmanifold G/Γ, adapted to some filtration. Suppose that d(xΓ, yΓ) = 0. Then
x = y (mod Γ).

Proof. Since the metric d on G is right-invariant we have

d(xΓ, yΓ) = inf_{γ ∈ Γ} d(x, yγ).

It suffices to show that the inf here is actually a minimum, to which end we need
only show that for any M there are just finitely many γ ∈ Γ with d(x, yγ) ≤ M. By
Lemma A.5 this assumption implies that d(y^{-1}x, γ) ≤ M', for some M' depending on
M, the rationality of the Mal'cev basis X and the size of the coordinates of x and y.
This in turn implies that d(id_G, γ) ≤ M'' which, in view of Lemma A.4, implies that
|ψ(γ)| ≤ M'''. But if γ ∈ Γ then the coordinates ψ(γ) are all integers, so the result
follows. □

Lemma A.16 (Nilmanifolds are bounded). Let Q ≥ 2, and suppose that X is a
Q-rational Mal'cev basis for a nilmanifold G/Γ (with respect to some filtration). Then
d(xΓ, yΓ) ≪ Q^{O(1)} uniformly in x, y ∈ G.

Proof. By Lemma A.14 we may choose γ and γ' so that |ψ(xγ)|, |ψ(yγ')| ≤ 1. The
claim now follows immediately from Lemma A.4. □

The final result of this appendix is not used in this paper but is required in §2 of the
companion paper [13].

Lemma A.17 (Comparison of metrics on nilmanifolds). Let Q ≥ 2. Suppose that
G' ⊆ G is a closed subgroup and that X, X' are Q-rational Mal'cev bases for G/Γ and
G'/Γ' respectively such that each X'_i is a Q-rational combination of the X_j. Let d, d' be
the metrics induced on G/Γ and G'/Γ' respectively. Then for any x, y ∈ G' we have

d'(xΓ', yΓ') ≪ Q^{O(1)} d(xΓ, yΓ)

and

d(xΓ, yΓ) ≪ Q^{O(1)} d'(xΓ', yΓ').

Proof. We prove the second inequality first. By the proof of Lemma A.15 there is
some γ' ∈ Γ' such that d'(xΓ', yΓ') = d'(x, yγ'). Here we may assume, using Lemma
A.14, that |ψ'(y)| ≤ 1. By Lemma A.16 we have d'(x, yγ') ≪ Q^{O(1)}, and therefore
by Lemma A.4 and the triangle inequality we have d'(id_G, yγ') ≪ Q^{O(1)}. By a second
application of Lemma A.4 it follows that |ψ'(yγ')| ≪ Q^{O(1)}. By Lemma A.6 we therefore
have d(x, yγ') ≪ Q^{O(1)} d'(x, yγ'). Since Γ' ⊆ Γ, this implies that

d(xΓ, yΓ) ≤ d(x, yγ') ≪ Q^{O(1)} d'(x, yγ') = Q^{O(1)} d'(xΓ', yΓ'),

which is the second inequality claimed.

To prove the first inequality we make the same initial manoeuvres. That is, we may
assume that |ψ(y)| ≤ 1 and that there is some γ ∈ Γ such that d(xΓ, yΓ) =
d(x, yγ). Let C be a constant to be specified later. If d(x, yγ) ≥ Q^{-C} then, by Lemma
A.16, the bound is trivial. Suppose, then, that d(x, yγ) < Q^{-C}. This is an assertion to
the effect that γ lies "near" G'. We will use the rationality properties of the coordinates
of Γ to conclude from this that γ must actually lie in G'.

By Lemma A.5 and Lemma A.3 we obtain d(z, γ) ≪ Q^{O(1)-C}, where z := y^{-1}x.
Since d(z, id_G) ≪ Q^{O(1)} we have d(γ, id_G) ≪ Q^{O(1)}, and so by Lemma A.4 it follows
that |ψ(z) - ψ(γ)| ≪ Q^{O(1)-C}. It follows from this and Lemma A.2 that

|ψ_exp(z) - ψ_exp(γ)| ≪ Q^{O(1)-C}. (A.14)

Now G' is defined, in exponential or type I coordinates, as the intersection of the kernels
of O(1) linear forms with rational coefficients of height O(Q^{O(1)}). The coordinates ψ(γ)
are integers and so the type I coordinates ψ_exp(γ) are, by Lemma A.2, rationals of height
O(Q^{O(1)}). The element z, of course, lies in G'. If C is chosen sufficiently large, it follows
from these observations and (A.14) that indeed γ lies in G' and hence in Γ'.

We now have that d(x, yγ') ≪ Q^{O(1)}, where γ' = γ lies in Γ'. One final application
of Lemma A.6 implies that d'(x, yγ') ≪ Q^{O(1)} d(x, yγ'), from which it of course follows
that

d'(xΓ', yΓ') ≤ d'(x, yγ) ≪ Q^{O(1)} d(x, yγ) = Q^{O(1)} d(xΓ, yΓ).

This concludes the proof. □
THE QUANTITATIVE BEHAVIOUR OF POLYNOMIAL ORBITS ON NILMANIFOLDS



BEN GREEN AND TERENCE TAO 



Abstract. A theorem of Leibman [35] asserts that a polynomial orbit (g(n)Γ)_{n∈Z}
on a nilmanifold G/Γ is always equidistributed in a union of closed sub-nilmanifolds
of G/Γ. In this paper we give a quantitative version of Leibman's result, describing
the uniform distribution properties of a finite polynomial orbit (g(n)Γ)_{n∈[N]} in a
nilmanifold. More specifically we show that there is a factorization g = εg'γ, where
ε(n) is "smooth", (γ(n)Γ)_{n∈Z} is periodic and "rational", and (g'(n)Γ)_{n∈P} is uniformly
distributed (up to a specified error δ) inside some subnilmanifold G'/Γ' of G/Γ for all
sufficiently dense arithmetic progressions P ⊆ [N].

Our bounds are uniform in N and are polynomial in the error tolerance δ. In a
companion paper [13] we shall use this theorem to establish the Möbius and Nilsequences
conjecture from our earlier paper [12].



1. Introduction 

Nilmanifolds. In the last few years it has come to be appreciated that nilmanifolds,
together with orbits on them, play a fundamental role in combinatorial number
theory. Their relevance was certainly apparent in [8], and it has been displayed quite
dramatically in recent ergodic-theoretic work of Host-Kra [16] and Ziegler [35]. More
recently the authors have explored how nilmanifolds arise in additive combinatorics [10]
and in the study of linear equations in the primes [12]. The present paper is a part of
that programme (and in particular will be used to prove the Möbius and Nilsequences
conjecture from [12] in the companion [13] to this paper) but, since it concerns only
the intrinsic properties of nilmanifolds, may be read independently of any of the other
work. The reader interested in the background may consult the surveys [HI [EHJ ED] or
the paper [12].

We begin by setting out our notation for nilmanifolds. 

Definition 1.1 (Filtrations and Nilmanifolds). Let G be a connected, simply connected
Lie group with identity element id_G. For the purposes of this paper we define a filtration
G_• on G to be a sequence of closed connected subgroups

G = G_0 = G_1 ⊇ G_2 ⊇ ⋯ ⊇ G_d ⊇ G_{d+1} = {id_G}

which has the property that [G_i, G_j] ⊆ G_{i+j} for all integers i, j ≥ 0. The least integer
d for which G_{d+1} = {id_G} is called the degree of the filtration G_•. Here, as usual,
the commutator group [H, K] is the group generated by {[h, k] : h ∈ H, k ∈ K}, where
[h, k] := hkh^{-1}k^{-1} is the commutator of h and k. If G possesses a filtration then we
say that G is nilpotent. Let Γ ⊆ G be a uniform subgroup (i.e. a discrete, cocompact
subgroup). Then the quotient G/Γ = {gΓ : g ∈ G} is called a nilmanifold. We also
write g (mod Γ) for gΓ.

Throughout the paper we will write m = dim G and m_i = dim G_i, i = 1, ..., d.

Remark. The assumptions of connectedness and simple-connectedness for G are not
completely standard, but are very convenient for us. In any situation in which we
apply our theorems, we expect to be able to reduce to this case. If a filtration G_• of
degree d exists then it is easy to see that the lower central series filtration¹ defined by
G = G_0 = G_1, G_{i+1} = [G, G_i] terminates with G_{s+1} = {id_G} for some integer s ≤ d.
We call the minimal such integer s the step of the nilpotent Lie group G. In this paper
the degree d will play a vastly more important role than the step s, since it will be
important to work with filtrations more general than the lower central series.

Examples. The simplest examples of nilmanifolds arise when s = 1, in which case
we may, after a linear transformation, take G = R^m and Γ = Z^m. The lower central
series filtration is given by G = G_0 = G_1 and G_2 = {id_G}. The nilmanifold G/Γ is
then referred to as a torus. Note that in this example the group operation is written
additively, as is conventional for abelian groups. When we are working with non-abelian
groups we shall write the group operation multiplicatively. The simplest non-abelian
example is given by the 3-dimensional Heisenberg nilmanifold, in which s = 2. We will
study this object in some detail later on. Here we take

G = \begin{pmatrix} 1 & \mathbb{R} & \mathbb{R} \\ 0 & 1 & \mathbb{R} \\ 0 & 0 & 1 \end{pmatrix} \quad \text{and} \quad \Gamma = \begin{pmatrix} 1 & \mathbb{Z} & \mathbb{Z} \\ 0 & 1 & \mathbb{Z} \\ 0 & 0 & 1 \end{pmatrix}. (1.1)

The lower central series filtration is given by G = G_0 = G_1,

G_2 = \begin{pmatrix} 1 & 0 & \mathbb{R} \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}

and G_3 = {id_G}. Observe that a fundamental domain for the action of Γ on G is

\left\{ \begin{pmatrix} 1 & x_1 & x_2 \\ 0 & 1 & x_3 \\ 0 & 0 & 1 \end{pmatrix} : 0 \le x_1, x_2, x_3 < 1 \right\}. (1.2)

Thus one can view G/Γ as a unit cube, with the sides glued together in a twisted
fashion. o
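
To make the fundamental domain (1.2) concrete, the following is a minimal sketch of how a Heisenberg group element can be moved into it by right multiplication by an element of Γ. It assumes the matrix-entry labelling (x_1, x_2, x_3) of (1.2); the function names heis_mul and reduce_to_fundamental_domain are hypothetical and chosen only for this illustration, and exact rational arithmetic is used so that no rounding occurs.

```python
from fractions import Fraction
from math import floor


def heis_mul(g, h):
    """Multiply two upper unitriangular 3x3 matrices.

    An element is stored as (x1, x2, x3), where x1 is the (1,2) entry,
    x2 the (1,3) entry and x3 the (2,3) entry, as in (1.2).
    """
    x1, x2, x3 = g
    y1, y2, y3 = h
    return (x1 + y1, x2 + y2 + x1 * y3, x3 + y3)


def reduce_to_fundamental_domain(g):
    """Return (rep, gamma) with rep = g * gamma, gamma an integer matrix,
    and all three entries of rep lying in [0, 1)."""
    x1, x2, x3 = g
    a = -floor(x1)            # fixes the (1,2) entry
    c = -floor(x3)            # fixes the (2,3) entry
    b = -floor(x2 + x1 * c)   # fixes the (1,3) entry, after the twist x1 * c
    gamma = (a, b, c)
    return heis_mul(g, gamma), gamma


if __name__ == "__main__":
    g = (Fraction(5, 2), Fraction(7, 4), Fraction(-1, 3))
    rep, gamma = reduce_to_fundamental_domain(g)
    print(rep, gamma)
```

The twist term x1 * c in the (1,3) entry is what produces the "twisted" gluing of the unit cube: the integer correction needed in the central coordinate depends on the other two coordinates.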

This paper will be concerned with the qualitative and quantitative equidistribution 
of various algebraic sequences on nilmanifolds. We first set out our notation for equidis- 
tribution. 

Definition 1.2 (Equidistribution). Let G/Γ be a nilmanifold. Here and in the sequel
we endow G/Γ with the unique normalised Haar measure, we let [N] := {n ∈ Z : 1 ≤
n ≤ N}, and we write E_{a∈A} f(a) := (1/|A|) Σ_{a∈A} f(a) for the average of f on the set A.

(i) An infinite sequence (g(n)Γ)_{n∈N} in G/Γ is said to be equidistributed if we have

\lim_{N \to \infty} \mathbb{E}_{n \in [N]} F(g(n)\Gamma) = \int_{G/\Gamma} F

for all continuous functions F : G/Γ → C.

(ii) An infinite sequence (g(n)Γ)_{n∈Z} in G/Γ is said to be totally equidistributed if
the sequences (g(an + r)Γ)_{n∈N} are equidistributed for all a ∈ Z\{0} and r ∈ Z.

(iii) Given a length N > 0 and an error tolerance δ > 0, a finite sequence (g(n)Γ)_{n∈[N]}
is said to be δ-equidistributed if we have

\left| \mathbb{E}_{n \in [N]} F(g(n)\Gamma) - \int_{G/\Gamma} F \right| \le \delta \|F\|_{\mathrm{Lip}}

for all Lipschitz functions F : G/Γ → C, where

\|F\|_{\mathrm{Lip}} := \|F\|_{\infty} + \sup_{x, y \in G/\Gamma,\ x \ne y} \frac{|F(x) - F(y)|}{d_{G/\Gamma}(x, y)}

and the metric d_{G/Γ} on G/Γ will be defined in Definition 2.2 in the next section
(it will involve choosing a Mal'cev basis X for G/Γ).

(iv) A finite sequence (g(n)Γ)_{n∈[N]} is said to be totally δ-equidistributed if we have

\left| \mathbb{E}_{n \in P} F(g(n)\Gamma) - \int_{G/\Gamma} F \right| \le \delta \|F\|_{\mathrm{Lip}}

for all Lipschitz functions F : G/Γ → C and all arithmetic progressions P ⊆ [N]
of length at least δN.

1 It is not hard to see that the lower central series filtration is a filtration, in that we have [G_i, G_j] ⊆
G_{i+j} for all i, j ≥ 0.

We will be interested in the qualitative question of when a sequence (g(n)Γ)_{n∈N} is
equidistributed (or totally equidistributed), as well as the more quantitative question of
when a finite sequence (g(n)Γ)_{n∈[N]} is δ-equidistributed (or totally δ-equidistributed).
Such questions, and corresponding questions in more general settings (for example when 
G/T is a homogeneous space of a general, not necessarily nilpotent, Lie group) play a 
fundamental role in number theory; see [M] for a discussion. These questions are also 
closely related to the celebrated theorem of Ratner [28] on unipotent flows, although as 
we are restricting attention to nilmanifolds, we will not need the full force of Ratner's 
theorem (or quantitative versions thereof) here. 

Qualitative equidistribution theory of linear sequences. To begin the 
discussion let us first restrict attention to linear sequences. 

Definition 1.3 (Linear sequences). A linear sequence in a group G is any sequence
g : Z → G of the form g(n) := a^n x for some a, x ∈ G. A linear sequence in a nilmanifold
G/Γ is a sequence of the form (g(n)Γ)_{n∈Z}, where g : Z → G is a linear sequence in G.

In the additive case G = R^m, Γ = Z^m, a linear sequence takes the form (an +
x (mod Z^m))_{n∈Z}. In this case one can understand equidistribution satisfactorily using
Kronecker's theorem and its variants. For instance, to answer qualitative questions
about equidistribution in this case, we have the following classical result.

Theorem 1.4 (Qualitative Kronecker theorem). Let m ≥ 1, and let (g(n) (mod Z^m))_{n∈Z}
be a linear sequence in the torus R^m/Z^m. Then exactly one of the following statements
is true.

(i) (g(n) (mod Z^m))_{n∈N} is equidistributed in R^m/Z^m.

(ii) There exists a non-trivial character η : R^m → R/Z, i.e. a continuous additive
homomorphism which annihilates Z^m but does not vanish entirely, such that η ∘ g
is constant. (Equivalently, if g(n) = an + x, there exists a non-zero k ∈ Z^m
such that k · a ∈ Z.)

In particular, (g(n) (mod Z^m))_{n∈N} is equidistributed if and only if it is totally
equidistributed.
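
The dichotomy in Theorem 1.4 can be observed numerically in the case m = 1. The sketch below is an illustration only: it relies on Weyl's criterion (a standard fact that is not stated in the text above), according to which equidistribution of (an (mod Z)) is detected by the averages of e(k·a·n) := exp(2πi·k·a·n) tending to zero for every non-zero integer k. The function name weyl_average and the sample values of a, k and N are assumptions made for this example.

```python
import cmath
import math


def weyl_average(alpha, k, N):
    """Average of exp(2*pi*i*k*alpha*n) over n = 1, ..., N."""
    total = sum(cmath.exp(2j * math.pi * k * alpha * n) for n in range(1, N + 1))
    return total / N


if __name__ == "__main__":
    N = 1000
    # Case (ii): a = 1/2 and k = 2 give k * a in Z, so the character
    # x -> 2x (mod Z) is constant along the orbit and the average stays at 1.
    print(abs(weyl_average(0.5, 2, N)))
    # Case (i): a = sqrt(2) is irrational, so the averages are small.
    print(abs(weyl_average(math.sqrt(2), 1, N)))
```

For irrational a the geometric series bound |Σ e(kan)| ≤ 1/|sin(πka)| shows that the average decays like 1/N for each fixed k, whereas in case (ii) the obstruction k persists for every N.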

Remarks. An equivalent formulation of this theorem is that if the linear sequence
(g(n) (mod Z^m))_{n∈N} is not equidistributed, then this sequence instead takes values in a
finite union of proper subtori of G/Γ. This can be viewed as an extremely simple special
case of the theorems of Ratner [28] and Shah [29]. More quantitative results can be
obtained via Fourier analysis²; see Proposition 3.1 below.

A remarkable theorem of Leon Green allows one to reduce qualitative questions about 
the distribution of orbits on nilmanifolds of step s > 1 to the abelian case just described. 

Definition 1.5 (Horizontal torus). Given a nilmanifold G/Γ, the horizontal torus is
defined to be (G/Γ)_ab := G/[G, G]Γ. We let π : G → (G/Γ)_ab be the canonical projection
map. A horizontal character is a continuous additive homomorphism η : G → R/Z
which annihilates Γ; observe that such characters in fact annihilate [G, G]Γ and so can
be viewed as characters on the horizontal torus. We say that a horizontal character is
non-trivial if it is not identically zero.

It follows from results of Mal'cev [25], and in particular the existence of so-called
Mal'cev bases, that (G/Γ)_ab really is a torus and in fact is isomorphic to R^{m_ab}/Z^{m_ab},
where m_ab := dim(G) - dim([G, G]). We will not actually need this characterisation,
as the properties of horizontal characters η : G → R/Z will be our main focus. Readers
may find it useful to keep this in mind, however.

Theorem 1.6 (Leon Green's theorem). Let (g(n)Γ)_{n∈Z} be a linear sequence in a
nilmanifold G/Γ. Then the orbit (g(n)Γ)_{n∈N} is equidistributed in G/Γ if and only if the
projected orbit (π(g(n)Γ))_{n∈N} is equidistributed in the horizontal torus (G/Γ)_ab. (In
particular, (g(n)Γ)_{n∈Z} is equidistributed if and only if it is totally equidistributed.)

Proof. See [H [TJ]. Leon Green used representation theory to establish his result,
but a more elementary proof was subsequently found by Parry [26]. □

Example. Suppose that G/Γ is the Heisenberg example (1.1). Then

[G, G] = \begin{pmatrix} 1 & 0 & \mathbb{R} \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}

and (G/Γ)_ab may be identified with R²/Z², the projection π being given by

\pi \begin{pmatrix} 1 & x_1 & x_2 \\ 0 & 1 & x_3 \\ 0 & 0 & 1 \end{pmatrix} = (x_1, x_3) \pmod{\mathbb{Z}^2}.

Leon Green's theorem implies that the orbit (a^n Γ)_{n∈N}, where

a = \begin{pmatrix} 1 & \alpha_1 & \alpha_2 \\ 0 & 1 & \alpha_3 \\ 0 & 0 & 1 \end{pmatrix},

is equidistributed in G/Γ if and only if 1, α_1 and α_3 are independent over Q. It is already
somewhat nontrivial to establish this result directly. o

2 In this simple setting one could also use more classical tools such as Minkowski's geometry of
numbers, and in the m = 1 case one could even use continued fractions. However, these methods do
not seem to extend easily to higher steps.

By Kronecker's theorem, we can then recast Theorem 1.6 in the following equivalent
formulation:

Theorem 1.7 (Leon Green's theorem, again). Let (g(n)Γ)_{n∈Z} be a linear sequence in
a nilmanifold G/Γ. Then exactly one of the following statements is true:

(i) (g(n)Γ)_{n∈N} is equidistributed in G/Γ.

(ii) There exists a non-trivial horizontal character η : G → R/Z such that η ∘ g is
constant.

Qualitative equidistribution theory of polynomial sequences. While 
our primary applications are concerned with linear sequences, it turns out for various 
technical reasons that it is important to work in the more general class of polynomial 
sequences. 

Definition 1.8 (Polynomial sequences in nilpotent groups). Suppose that G is a nilpotent
group with a filtration G_•. Let g : Z → G be a sequence. If h ∈ Z we write
∂_h g(n) := g(n + h)g(n)^{-1}. We say that g is a polynomial sequence with coefficients in G_•,
and write g ∈ poly(Z, G_•), if ∂_{h_i} ⋯ ∂_{h_1} g takes values in G_i for all positive integers i and
for all choices of h_1, ..., h_i ∈ Z. In this case we say that g has degree d. If g lies in
poly(Z, G_•) for some filtration G_• then we simply say that g is a polynomial sequence.

This definition is a little abstract. However we will show in §6 that g : Z → G
is a polynomial sequence if and only if g has the form g(n) = a_1^{p_1(n)} ⋯ a_k^{p_k(n)}, where
a_1, ..., a_k ∈ G and the p_i : N → N are polynomials. In particular a linear sequence
g(n) = a^n x is a polynomial sequence, and in fact since ∂_{h_1} g(n) = a^{h_1} and ∂_{h_2} ∂_{h_1} g(n) =
id_G it is clear that such a sequence has coefficients in the lower central series filtration
G_•. Note carefully that the degree of a linear sequence is equal to the step s of the
underlying Lie group G, and is not equal to one as the name "linear" might suggest.
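
The two derivative identities for a linear sequence g(n) = a^n x can be checked by direct computation in the Heisenberg group. The following is a minimal sketch under stated assumptions: group elements are stored by their matrix entries (x_1, x_2, x_3) as in (1.2), the helper names mul, inv, power and derivative are hypothetical, and the particular a and x are arbitrary rational test values.

```python
from fractions import Fraction

IDENTITY = (0, 0, 0)


def mul(g, h):
    """Product of unitriangular matrices stored as (x1, x2, x3) as in (1.2)."""
    x1, x2, x3 = g
    y1, y2, y3 = h
    return (x1 + y1, x2 + y2 + x1 * y3, x3 + y3)


def inv(g):
    """Inverse of a unitriangular matrix in the same coordinates."""
    x1, x2, x3 = g
    return (-x1, -x2 + x1 * x3, -x3)


def power(g, n):
    """g^n for any integer n, by repeated multiplication."""
    base = g if n >= 0 else inv(g)
    result = IDENTITY
    for _ in range(abs(n)):
        result = mul(result, base)
    return result


def derivative(seq, h):
    """The sequence n -> seq(n + h) * seq(n)^{-1}, as in Definition 1.8."""
    return lambda n: mul(seq(n + h), inv(seq(n)))


if __name__ == "__main__":
    a = (Fraction(1, 2), Fraction(1, 5), Fraction(1, 3))
    x = (Fraction(2), Fraction(3, 7), Fraction(-1))
    g = lambda n: mul(power(a, n), x)      # linear sequence g(n) = a^n x
    first = derivative(g, 3)               # should be constant, equal to a^3
    second = derivative(first, 2)          # should be the identity
    print(first(0) == power(a, 3), second(5) == IDENTITY)
```

The first derivative is the constant a^h because the factor x cancels in g(n + h)g(n)^{-1}, and the derivative of a constant sequence is trivially the identity.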

A remarkable result of Lazard and Leibman [TjJl [201 EI] asserts that poly(Z, G,) is a 
group. We will prove this in §H1 and it will play a key role in several of our arguments. 

Theorem 1.6 was extended by Leibman [22] to the case when g(n) is a polynomial
sequence rather than a linear one. In particular, he showed the following generalisation
of Theorem 1.7.

Theorem 1.9 (Leibman's theorem). [22] Suppose that G/Γ is a nilmanifold and that
g : Z → G is a polynomial sequence. Then exactly one of the following statements is
true:

(i) (g(n)Γ)_{n∈N} is equidistributed in G/Γ.

(ii) There exists a non-trivial horizontal character η : G → R/Z such that η ∘ g is
constant.

Remark. This theorem significantly generalizes the classical theorem of Weyl that
a polynomial sequence in R/Z is equidistributed unless all of its non-constant
coefficients are rational. We will in fact use a quantitative version of Weyl's theorem in our
arguments; see Proposition 4.3 below.
We can iterate this theorem to establish a factorization result. We first need some
notation.

Definition 1.10 (Rational subgroup). Let G/Γ be a nilmanifold. A rational subgroup
of G is a closed connected subgroup G' of G such that G'Γ/Γ ≅ G'/Γ' = G'/(G' ∩ Γ) is
a closed submanifold of G/Γ (or equivalently, that Γ' is a cocompact subgroup of G').
We say that G' is proper if G' ≠ G.

Example. If G/Γ is a nilmanifold (that is to say if there exists a uniform subgroup
Γ ≤ G) one can show that each member G_i of the lower central series is a rational
subgroup; see e.g. [1] or [25]. o

Definition 1.11 (Rational sequence). Let G/Γ be a nilmanifold. A rational group
element is any g ∈ G such that g^r ∈ Γ for some integer r > 0. A rational point is any
point in G/Γ of the form gΓ for some rational group element g. A sequence (g(n)Γ)_{n∈Z}
is rational if every element g(n)Γ in the sequence is a rational point.

Remark. It is not difficult to show that the rational group elements form a dense
subgroup of G that contains Γ; see Lemma A.11. We will show in Lemma A.12 that
any polynomial sequence in G/Γ which is rational is automatically periodic.

Corollary 1.12 (Factorization theorem for polynomial sequences). Let (g(n)Γ)_{n∈Z} be a
polynomial sequence in a nilmanifold G/Γ. Then there exists a rational subgroup G' of
G and a factorization g = εg'γ, where ε ∈ G is a constant, g' : Z → G' is a polynomial
sequence such that (g'(n)Γ')_{n∈Z} is totally equidistributed in G'/Γ' (where Γ' := G' ∩ Γ),
and γ : Z → G is a polynomial sequence such that the sequence (γ(n)Γ)_{n∈Z} is rational
(and hence, by Lemma A.12 (i), is periodic).

Proof. We give a sketch of this argument only; we will repeat this argument in more
detail when proving Theorem 1.19 below.

We induct on the dimension m of G/Γ, assuming that the claim has already been
proven for all nilmanifolds of lesser dimension. By replacing g(n) with g(0)^{-1}g(n) if
necessary (absorbing the g(0) factor into the ε term) we may normalise so that g(0) = id_G.
If (g(n)Γ)_{n∈Z} is equidistributed on G/Γ, then it is totally equidistributed by Leibman's
theorem, and we are done (with g' = g, G' = G, and ε, γ trivial). So we may assume
that (g(n)Γ)_{n∈Z} is not equidistributed. By Leibman's theorem, there exists a non-trivial
horizontal character η : G → R/Z such that η ∘ g is constant; in fact by our normalisation
g(0) = id_G we must have η ∘ g = 0, thus g takes values in ker(η). It is then not difficult
to factorise g = g_0 γ_0, where γ_0 is a polynomial sequence with (γ_0(n)Γ)_{n∈Z} rational and
periodic, and g_0 is a polynomial sequence taking values in the proper rational subgroup
G' ≤ G, defined to be the connected component of ker(η) which contains the origin. The
claim then follows by applying the induction hypothesis to the sequence (g_0(n)Γ')_{n∈Z} in
the nilmanifold G'/Γ', which has dimension m - 1, and using the fact that the product
of two rational group elements is again rational, as well as the trivial observation that
rational group elements of G' are automatically rational group elements of G also. □

Remark. In words, this corollary asserts that in the qualitative setting, one can
decompose

(arbitrary polynomial sequence) = (constant) × (totally equidistributed) × (periodic).

An inspection of the proof reveals that one can in fact take the constant ε to be g(0).

As a corollary we obtain a Ratner-Shah type theorem for polynomial sequences in
nilmanifolds, first established by Leibman [22]:

Corollary 1.13 (Leibman's Ratner-Shah type theorem for nilmanifolds). Let (g(n)Γ)_{n∈Z}
be a polynomial sequence in a nilmanifold G/Γ. Then there exists a rational subgroup G'
of G, a group element ε ∈ G, and a rational periodic sequence (x_n)_{n∈Z} in G/Γ with some
period q such that for every r ∈ Z, the sequence (g(qn + r)Γ)_{n∈Z} is totally equidistributed
in εG'x_r.

Remark. Shah [29] obtained a similar result for arbitrary discrete unipotent (but
linear) flows on a finite volume homogeneous space; the case of continuous unipotent
linear flows was treated earlier by Ratner [28] (see [5] for further discussion). Leibman's
proof of Corollary 1.13 does not use these results, but instead proceeds in two stages.
Firstly, by iterating Theorem 1.6 (or more precisely a generalization of this theorem
to the case when G is not necessarily connected), a version of Corollary 1.13 for linear
sequences is obtained. Secondly, by utilising a lifting trick of Furstenberg [7, p. 31],
the polynomial case is deduced from the linear case. As we shall discuss shortly, these
arguments do not work well in the quantitative case, and one must instead grapple with
polynomial sequences directly.

Quantitative equidistribution results. This paper stems from an attempt to 
establish quantitative versions of the above theorems for finite orbits. Unfortunately, 
the need for quantitative bounds on all aspects of these results forces us to introduce a 
substantial amount of new notation. 

Definition 1.14 (Asymptotic notation). We use Y = O(X) or Y ≪ X to denote the
estimate |Y| ≤ CX for some absolute constant C. When we need to indicate dependence of
C on various parameters, we shall indicate this by subscripts, thus for instance O_{d,m}(X)
denotes a quantity bounded in magnitude by C_{d,m}X for some C_{d,m} depending only on
the quantities d, m.

Definition 1.15 (Circle norm). If x ∈ R/Z, we use ‖x‖_{R/Z} := dist(x, Z) to denote the
distance of x to the origin (thus ‖a (mod Z)‖_{R/Z} = |a| whenever -1/2 < a ≤ 1/2). If
x ∈ R, we write ‖x‖_{R/Z} for ‖x (mod Z)‖_{R/Z}.

Our first main result is the following quantitative version of Theorem 1.9. Note that
some of the terminology in this theorem will not be formally introduced until the next
section, but this should not prevent the reader from gaining a rough appreciation of the
statement.

Theorem 1.16 (Quantitative Leibman theorem). Let m, d ≥ 0, 0 < δ < 1/2, and N ≥
1. Let G/Γ be an m-dimensional nilmanifold together with a filtration G_• of degree d and
a 1/δ-rational Mal'cev basis X adapted to this filtration. Suppose that g ∈ poly(Z, G_•).
Then at least one of the following statements is true:

(i) (g(n)Γ)_{n∈[N]} is δ-equidistributed in G/Γ.

(ii) There exists a non-trivial horizontal character η : G → R/Z with |η| ≪ δ^{-O_{m,d}(1)}
such that ‖η ∘ g(n) - η ∘ g(n - 1)‖_{R/Z} ≪ δ^{-O_{m,d}(1)}/N for all n ∈ [N].
Remarks. The notions of a "1/δ-rational Mal'cev basis adapted to G_•", of the modulus
|η| of a horizontal character and of the metric which is implicit in the notion of
δ-equidistribution are technical and will be defined precisely in Definition 2.4, Definition
2.6 and Definition 2.2 respectively.

Theorem 1.16 asserts that the sequence (g(n)Γ)_{n∈[N]} is either δ-equidistributed up to
time N, or else it is very far from being equidistributed up to time δ^{O_{m,d}(1)}N, being
concentrated very close to a union of δ^{-O_{m,d}(1)} subtori. One should view N as being
very large compared to 1/δ, otherwise the content of the proposition is trivial. It is
not hard to deduce Theorem 1.9 from Theorem 1.16; we leave this to the reader as an
exercise.

For technical reasons it will be convenient later to strengthen the statement (ii)
slightly, so as to also control higher "derivatives" ∂^i(η ∘ g); see the next section for more
information.

Whereas in the qualitative setting one always works in the limit N → ∞, in the
quantitative setting one works with a fixed (but large) N. As N increases, there can
be transitions in the behaviour of the finite sequence (g(n)Γ)_{n∈[N]}, in which the
equidistribution (or lack thereof) changes significantly (cf. the "coalescence of progressions"
phenomenon [32, Chapter 12]); these transitions are a new feature of the quantitative
setting, which are not readily visible in the qualitative one. We illustrate this with a
simple example:

Example. Consider the (additive) example G = R, Γ = Z and g(n) = (1/2 + σ)n, where
σ > 0 is a small parameter. In this case we have m = d = 1. If N is much larger than
1/σ, we see that (g(n) (mod Z))_{n∈[N]} is δ-equidistributed. On the other hand, if N is
much smaller than 1/σ, we see that (g(n) (mod Z))_{n∈[N]} fails to be δ-equidistributed,
indeed it is highly concentrated around 0 and 1/2 in this case. However, if we let η :
G → R/Z be the non-trivial horizontal character η(x) := 2x (mod Z) we see that η(g(n))
is slowly varying in the sense of (ii). The transitional regime when N is comparable
to 1/σ is interesting; there is enough irregularity to prevent δ-equidistribution on the
sequence (g(n) (mod Z))_{n∈[N]}, but in order to obtain near-constancy of η(g(n)) one in
fact has to pass to shorter sequences such as (g(n)(mod Z)) ne [ C( sjv]- The need to work 
on a variety of different scales like this is very much a feature of additive combinatorics, 
particularly those parts of it that have the flavour of "quantitative ergodic theory" . The 
work of Bourgain [3] on Roth's theorem is another example. o 
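
The behaviour described in this example can be verified with exact arithmetic. The sketch below is an illustration under stated assumptions: the value σ = 1/1000 and the lengths N = 10 and N = 250 are arbitrary choices, and the helper names circle_norm, g, eta and max_distance_to_half_integers are hypothetical. It checks that η(g(n)) moves by exactly 2σ at each step, and measures how far the orbit strays from the points 0 and 1/2.

```python
from fractions import Fraction


def circle_norm(x):
    """Distance from x to the nearest integer (the circle norm of Definition 1.15)."""
    frac = x % 1
    return min(frac, 1 - frac)


def g(n, sigma):
    """The additive sequence g(n) = (1/2 + sigma) * n."""
    return (Fraction(1, 2) + sigma) * n


def eta(x):
    """The horizontal character eta(x) = 2x (mod Z)."""
    return (2 * x) % 1


def max_distance_to_half_integers(N, sigma):
    """Largest distance from g(n) (mod Z) to the set {0, 1/2}, over n in [N].

    The distance from x to (1/2)Z equals circle_norm(2x) / 2.
    """
    return max(circle_norm(2 * g(n, sigma)) / 2 for n in range(1, N + 1))


if __name__ == "__main__":
    sigma = Fraction(1, 1000)
    step = circle_norm(eta(g(7, sigma)) - eta(g(6, sigma)))
    print(step)                                      # equals 2 * sigma
    print(max_distance_to_half_integers(10, sigma))  # N much smaller than 1/sigma
    print(max_distance_to_half_integers(250, sigma)) # N comparable to 1/sigma
```

For N much smaller than 1/σ the orbit stays within σN of {0, 1/2}, while by N = 1/(4σ) it has already reached the maximal possible distance 1/4 from that set, which is the transition the example describes.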

Of course, by specialising to linear sequences, Theorem 1.16 also implies a
quantitative version of Leon Green's theorem. The proof of Theorem 1.16 could be simplified
somewhat in this case. Such a theorem is not especially useful, however. The following
example may help to illustrate why, in the quantitative setting, the consideration of
linear sequences leads naturally to the "polynomial" world.

Example. (The skew torus) Let us consider the Heisenberg example (1.1) once more,
taking now




where α := N^{-3/2}. Set
Translating to the fundamental domain, we obtain

g(n)\Gamma = \begin{pmatrix} 1 & \{2n\alpha\} & \{-n^2\alpha\} \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \Gamma.

(Here, and for the rest of the paper, we define {x} := x - ⌊x⌋, where ⌊x⌋ is the
greatest integer less than or equal to x.) The orbit (g(n)Γ)_{n∈[N]} is certainly not close to
equidistributed in G/Γ, and indeed the projected orbit (π(g(n)Γ))_{n∈[N]} stays very close
to the trivial subtorus T ⊆ R²/Z² which consists simply of the point {(0, 0)}.

Now π^{-1}(T) is of course isomorphic to a one-dimensional torus R/Z. However the
orbit (g(n)Γ)_{n∈[N]} does not approximate a linear orbit on this torus; rather, it has
quadratic behaviour. Thus (g(n)Γ)_{n∈[N]} is very close to (g'(n)Γ')_{n∈[N]} on G'/Γ' = R/Z,
where

G' := \begin{pmatrix} 1 & 0 & \mathbb{R} \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \qquad \Gamma' := \begin{pmatrix} 1 & 0 & \mathbb{Z} \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}

and

g'(n) := \begin{pmatrix} 1 & 0 & -n^2\alpha \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}. (1.3)

Thus, in order to approximate the linear sequence (g(n)Γ)_{n∈[N]} by a lower-dimensional
sequence, the latter sequence needs to be polynomial. Note however that if one had
the luxury of passing from [N] to a much shorter progression, e.g. [N^{1/100}], then the
lower-dimensional sequence would remain linear. In the limit N → ∞, N and N^{1/100}
both go to infinity, which may help explain why in the qualitative setting one can avoid
polynomial sequences entirely and work purely in the category of linear sequences.
Unfortunately, for the quantitative applications we have in mind (in particular, the
number-theoretic application in [13]) we cannot afford to reduce the scale N in such a
drastic manner³.

In much the same way that Theorem 1.9 could be iterated in order to establish
Corollary 1.12, we can iterate Theorem 1.16 to obtain a quantitative factorization theorem.
To state it we need quantitative versions of the "rationality" concepts of Definition 1.11
and also the new notion of smooth sequences, which must be introduced in place of
constant sequences in the finitary setting.

Definition 1.17 (Rational sequences, quantitative definitions). Let G/Γ be a
nilmanifold and let Q > 0 be a parameter. We say that γ ∈ G is Q-rational if γ^r ∈ Γ for some
integer r, 0 < r ≤ Q. A Q-rational point is any point in G/Γ of the form γΓ for some
Q-rational group element γ. A sequence (γ(n))_{n∈Z} is Q-rational if every element γ(n)Γ
in the sequence is a Q-rational point.
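
As an illustration of Definition 1.17, the following sketch searches for the least exponent r with γ^r ∈ Γ in the Heisenberg example (1.1). It assumes the matrix-entry coordinates (x_1, x_2, x_3) of (1.2); the names mul, smallest_exponent and is_q_rational are hypothetical and the sample element is an arbitrary choice. Note that the non-commutativity matters: the element below has horizontal entries 1/2 and 1/3, yet its sixth power is not integral.

```python
from fractions import Fraction

IDENTITY = (Fraction(0), Fraction(0), Fraction(0))


def mul(g, h):
    """Product of unitriangular matrices stored as (x1, x2, x3) as in (1.2)."""
    x1, x2, x3 = g
    y1, y2, y3 = h
    return (x1 + y1, x2 + y2 + x1 * y3, x3 + y3)


def smallest_exponent(gamma, max_r=1000):
    """Least r in 1..max_r such that gamma^r has integer entries, or None."""
    p = IDENTITY
    for r in range(1, max_r + 1):
        p = mul(p, gamma)
        if all(Fraction(c).denominator == 1 for c in p):
            return r
    return None


def is_q_rational(gamma, Q):
    """True if gamma^r lies in the integer lattice for some 0 < r <= Q."""
    return smallest_exponent(gamma, max_r=Q) is not None


if __name__ == "__main__":
    gamma = (Fraction(1, 2), Fraction(0), Fraction(1, 3))
    print(smallest_exponent(gamma))
    print(is_q_rational(gamma, 11), is_q_rational(gamma, 12))
```

Here the (1,3) entry of γ^r is r(r - 1)/12, which is why the least admissible exponent is 12 rather than 6.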

Definition 1.18 (Smooth sequences). Let G/Γ be a nilmanifold with a Mal'cev basis
X. Let (ε(n))_{n∈Z} be a sequence in G, and let M, N ≥ 1. We say that (ε(n))_{n∈Z}
is (M, N)-smooth if we have d(ε(n), id_G) ≤ M and d(ε(n), ε(n - 1)) ≤ M/N for all
n ∈ [N], where the metric d = d_X on G will be defined in Definition 2.2.



3 This is ultimately because it is known how to obtain non-trivial control on averages of
number-theoretic functions such as the Möbius function μ on intervals such as [N, N + N log^{-A} N], but not in
intervals such as [N, N + N^{1/100}], even if one assumes strong hypotheses such as GRH.






Note that the notion of an $(M, N)$-smooth sequence collapses to that of a constant
sequence in the limit $N \to \infty$ (holding $M$ fixed).

Theorem 1.19 (Factorization theorem). Let $m, d \ge 0$, and let $M_0, N \ge 1$ and $A > 0$
be real numbers. Suppose that $G/\Gamma$ is an $m$-dimensional nilmanifold together with a
filtration $G_\bullet$ of degree $d$. Suppose that $X$ is an $M_0$-rational Mal'cev basis adapted to
$G_\bullet$ and that $g \in \mathrm{poly}(\mathbb{Z}, G_\bullet)$. Then there is an integer $M$ with $M_0 \le M \ll M_0^{O_{A,m,d}(1)}$,
a rational subgroup $G' \subseteq G$, a Mal'cev basis $X'$ for $G'/\Gamma'$ in which each element is
an $M$-rational combination of the elements of $X$, and a decomposition $g = \varepsilon g' \gamma$ into
polynomial sequences $\varepsilon, g', \gamma \in \mathrm{poly}(\mathbb{Z}, G_\bullet)$ with the following properties:

(i) $\varepsilon : \mathbb{Z} \to G$ is $(M, N)$-smooth;

(ii) $g' : \mathbb{Z} \to G'$ takes values in $G'$, and the finite sequence $(g'(n)\Gamma')_{n \in [N]}$ is totally
$1/M^A$-equidistributed in $G'/\Gamma'$, using the metric $d_{X'}$ on $G'/\Gamma'$;

(iii) $\gamma : \mathbb{Z} \to G$ is $M$-rational, and $(\gamma(n)\Gamma)_{n \in \mathbb{Z}}$ is periodic with period at most $M$.

Remark. In words, this corollary asserts that in the quantitative setting, one can 
decompose 

(arbitrary polynomial sequence) = (smooth) x (totally equidistributed) x (periodic). 

The notion of a subgroup $G'$ being $M$-rational relative to a Mal'cev basis $X$ will be
defined in Definition 2.5. This result has some faint resemblance to the Szemerédi
regularity lemma [30], although with the key difference that our bounds here are all
polynomial in nature.

The derivation of Theorem 1.19 from Theorem 1.16 will be performed in §10.

We will use Theorem 1.19 in [13] in order to establish the Möbius and Nilsequences
conjecture MN(s) from [12] for arbitrary step $s$. For this application, it is important that
all bounds here are only polynomial in $M$, and that the equidistribution is established
on progressions of length linear in $N$ (as opposed to $N^c$ for some small $c > 0$).

Just as Corollary 1.12 implies a Ratner-type theorem, namely Corollary 1.13, it is
not hard to deduce the following result from Theorem 1.19.

Corollary 1.20 (Ratner-type theorem for polynomial nilsequences). Let $m, d \ge 0$,
$0 < \delta < 1/2$, and $N \ge 1$. Suppose that $G/\Gamma$ is an $m$-dimensional nilmanifold, that $G_\bullet$
is a filtration of degree $d$ on $G$, and that $X$ is a $1/\delta$-rational Mal'cev basis adapted to
$G_\bullet$. Suppose that $g \in \mathrm{poly}(\mathbb{Z}, G_\bullet)$. Then we may decompose $[N]$ as a union $P_1 \cup \dots \cup P_k$
of arithmetic progressions with length $\gg \delta^{O_{m,d}(1)} N$ and the same common difference $q$,
$1 \le q \ll \delta^{-O_{m,d}(1)}$, such that each orbit $(g(n)\Gamma)_{n \in P_i}$ is within $\delta$ (using the metric $d_X$) of
being equidistributed on $x_i G' y_i \Gamma/\Gamma \subseteq G/\Gamma$, where $x_i \in G$, $y_i \in G$ is $\delta^{-O_{m,d}(1)}$-rational,
and $G'$ is a closed subgroup of $G$ which is $\delta^{-O_{m,d}(1)}$-rational relative to $X$ (this notion
will be defined in the next section).

Remark. The reader may wish to compare this with [B], another recent result on 
quantitative variants of Ratner's theorem. 

Let us conclude this introduction by remarking that our main theorem actually
applies to multiparameter polynomial mappings $g : \mathbb{Z}^t \to G$. In the infinitary setting such
a generalization was obtained by Leibman [23], and his result has subsequently been
applied in such papers as [2 J and [24]. We have taken the trouble to derive multiparameter
extensions of our main results with analogous finitary applications in mind; see
Theorem 8.6 and Theorem 10.2.

2. Precise statements of results 

In this section we define various "quantitative" concepts (such as $Q$-rational Mal'cev
bases, subgroups which are $Q$-rational relative to such a basis and the metrics $d_X$ and
$d_{G/\Gamma}$) which were needed to properly state the main results from the introduction section.
We also give a more precise version of Theorem 1.16, which we will then spend the next
several sections proving.

Mal'cev bases and metrics on $G/\Gamma$. The notion of Mal'cev coordinates plays
a vital role in the quantitative theory of nilmanifolds. They allow us to put a metric
on $G/\Gamma$, which in turn allows us to define the notion of equidistribution; they also
quantify the "rationality" of various objects associated to the nilmanifold. Mal'cev
coordinates were introduced in [25], which contains a nice discussion; they are covered
quite extensively in the book [4], particularly Chapters 1 and 5. We will also need
several more quantitative statements about Mal'cev coordinates, which we have placed
in Appendix A. We recommend that the reader dip into that appendix as and when
required.

We will make use of the Lie algebra $\mathfrak{g}$ of $G$ together with the exponential map $\exp :
\mathfrak{g} \to G$. When $G$ is a connected, simply-connected nilpotent Lie group the exponential
map is a diffeomorphism; see [4, Theorem 1.2.1]. In particular, we have a logarithm
map $\log : G \to \mathfrak{g}$. One does not really need to have an understanding of the exponential
and logarithm maps beyond some of their formal properties, which we will list as we
need them, in order to understand this paper.

Definition 2.1 (Mal'cev bases). Let $G/\Gamma$ be an $m$-dimensional nilmanifold and let $G_\bullet$
be a filtration. A basis $X = \{X_1, \dots, X_m\}$ for the Lie algebra $\mathfrak{g}$ over $\mathbb{R}$ is called a
Mal'cev basis for $G/\Gamma$ adapted to $G_\bullet$ if the following four conditions are satisfied:

(i) For each $j = 0, \dots, m-1$ the subspace $\mathfrak{h}_j := \mathrm{Span}(X_{j+1}, \dots, X_m)$ is a Lie
algebra ideal in $\mathfrak{g}$, and hence $H_j := \exp \mathfrak{h}_j$ is a normal Lie subgroup of $G$.

(ii) For every $0 \le i \le s$ we have $G_i = H_{m - m_i}$ (recall that $m_i = \dim G_i$);

(iii) Each $g \in G$ can be written uniquely as $\exp(t_1 X_1) \exp(t_2 X_2) \cdots \exp(t_m X_m)$, for
$t_i \in \mathbb{R}$;

(iv) $\Gamma$ consists precisely of those elements which, when written in the above form,
have all $t_i \in \mathbb{Z}$.

Remarks. Our main results only make sense if the nilmanifold $G/\Gamma$ is already equipped
with a Mal'cev basis $X$, since they involve quantitative dependencies that can only be
described using such a basis. However it is a well-known result of Mal'cev [25] that any
nilmanifold $G/\Gamma$ can be equipped with a Mal'cev basis adapted to the lower central
series filtration. Indeed the very existence of a discrete and cocompact subgroup $\Gamma$
guarantees that the lower central series is rational by [4, Theorem 5.1.8 (a)] and [4,
Corollary 5.2.2]. One may then apply [4, Proposition 5.3.2] to deduce the existence of
a Mal'cev basis adapted to the lower central series. More generally there is a Mal'cev
basis adapted to any filtration $G_\bullet$ which consists of rational subgroups (cf. Definition

on}. 

We refer to the $t_i$ as the Mal'cev coordinates of $g$, and we define the Mal'cev coordinate
map $\psi = \psi_X : G \to \mathbb{R}^m$ to be the map

$$\psi(g) := (t_1, \dots, t_m), \tag{2.1}$$

thus for instance $\Gamma = \psi^{-1}(\mathbb{Z}^m)$. If $X'$ is another Mal'cev basis (relative to some
filtration) then we write $\psi' = \psi_{X'}$. Only very occasionally will we need to use the notation
$\psi_{\mathcal{Y}}$ to indicate the coordinate map relative to some further basis $\mathcal{Y}$.

Remarks. In the literature, Mal'cev coordinates are invariably discussed in the context
of the lower central series filtration and are referred to as coordinates of the second kind.
Coordinates of the first kind or exponential coordinates are derived by writing $\log g \in \mathfrak{g}$
as a linear combination $\log g = s_1 X_1 + \dots + s_m X_m$ of elements of the basis $X$, and
we write $\psi_{\exp}(g) = \psi_{X,\exp}(g) := (s_1, \dots, s_m)$ for the coordinates of $g$ obtained in this
fashion. However, we shall mostly work using coordinates of the second kind.

We can use a Mal'cev basis $X$ to put a (slightly artificial) metric structure on $G$ and
on $G/\Gamma$.

Definition 2.2 (Metrics on $G$ and $G/\Gamma$). Let $G/\Gamma$ be a nilmanifold with Mal'cev basis
$X$. We define $d = d_X : G \times G \to \mathbb{R}_{\ge 0}$ to be the largest metric such that $d(x,y) \le
|\psi(xy^{-1})|$ for all $x, y \in G$, where $|\cdot|$ denotes the $\ell^\infty$-norm on $\mathbb{R}^m$. More explicitly, we
have

$$d(x,y) = \inf\Big\{ \sum_{i=1}^{n} \min\big(|\psi(x_{i-1} x_i^{-1})|, |\psi(x_i x_{i-1}^{-1})|\big) : x_0, \dots, x_n \in G;\ x_0 = x;\ x_n = y \Big\}.$$

This descends to a metric on $G/\Gamma$ by setting

$$d(x\Gamma, y\Gamma) := \inf\{ d(x', y') : x', y' \in G;\ x' \equiv x \ (\mathrm{mod}\ \Gamma);\ y' \equiv y \ (\mathrm{mod}\ \Gamma) \}.$$

It turns out that this is indeed^4 a metric on $G/\Gamma$; this essentially follows from the
discreteness of $\Gamma$ in $G$, and we will prove it in Lemma A.15. Since $d$ is right-invariant,
we also have

$$d(x\Gamma, y\Gamma) = \inf_{\gamma \in \Gamma} d(x, y\gamma).$$
7er 

When the letter $d$ is used for a metric, it will always denote the metric $d_X$ relative to
some basis $X$ that is already under discussion. The symbol $d'$ will be used for the metric
defined using some other basis $X'$. On the very rare occasions (for example in the proof
of Lemma 7.4) where the metric relative to some further basis is under consideration
we will indicate this explicitly using subscripts.

Quantitative rationality. Now we define the concept of rational nilmanifolds 
and subgroups. 



4 We note that this metric structure is a little more specific than in some of our previous papers,
notably that in [12, §8]. This will not cause any difficulty, as the metrics in that paper are equivalent
to the one given here, up to constants depending on $G$, $\Gamma$ and $X$. Indeed, at small scales $d$ agrees with
the distance function given by the unique right-invariant Riemannian metric on $G$ whose value at the
origin is equal to that of the Euclidean metric at the origin of $\mathbb{R}^m$, pulled back by $\psi$; see also Lemma

El 
Definition 2.3 (Height). The height of a real number $x$ is defined as $\max(|a|, |b|)$ if
$x = a/b$ is rational in reduced form, and $\infty$ if $x$ is irrational.
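
As a small illustration of the height of a rational number (the maximum of the absolute values of numerator and denominator in reduced form), here is a minimal Python sketch. The function names are hypothetical and chosen for this illustration only; irrational numbers, which have infinite height, are not representable here.

```python
from fractions import Fraction


def height(x: Fraction) -> int:
    """Height max(|a|, |b|) of a rational x = a/b written in reduced form."""
    # Fraction always stores numerator and denominator in lowest terms.
    return max(abs(x.numerator), abs(x.denominator))


def has_height_at_most(x: Fraction, Q: int) -> bool:
    """True when the rational x has height at most Q."""
    return height(x) <= Q
```

For instance, 6/4 reduces to 3/2 and so has height 3, while 0 = 0/1 has height 1.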

Definition 2.4 (Rationality of a basis). Let $G/\Gamma$ be a nilmanifold and $Q > 0$. We say
that a Mal'cev basis $X$ for $G/\Gamma$ is $Q$-rational if all of the structure constants $c_{ijk}$ in the
relations

$$[X_i, X_j] = \sum_k c_{ijk} X_k$$

are rational with height at most $Q$.

Definition 2.5 (Rational subgroups). Suppose that a nilmanifold $G/\Gamma$ is given together
with a Mal'cev basis $X = \{X_1, \dots, X_m\}$, and that $Q > 0$. Suppose that $G' \subseteq G$ is a
closed connected subgroup. We say that $G'$ is $Q$-rational relative to $X$ if the Lie algebra
$\mathfrak{g}'$ has a basis $X' = \{X'_1, \dots, X'_{m'}\}$ consisting of linear combinations $\sum_{i=1}^{m} a_i X_i$, where
$a_i$ are rational numbers with height at most $Q$ for all $i$.

Definition 2.6 (Modulus of a horizontal character). Suppose that $G/\Gamma$ is a nilmanifold
with a Mal'cev basis $X$. Suppose that $\eta : G \to \mathbb{R}/\mathbb{Z}$ is a horizontal character, that is
to say a homomorphism from $G$ to $\mathbb{R}/\mathbb{Z}$ which annihilates $\Gamma$. Then, when written
in coordinates relative to $X$, properties (iii) and (iv) of Definition 2.1 imply that
$\eta(g) = k \cdot \psi(g)$ for some unique $k \in \mathbb{Z}^m$. We write $|\eta| := |k|$.

Smooth polynomial sequences. For technical reasons it will be convenient to
quantify the smoothness of sequences, such as the sequence $\varepsilon(n)$ appearing in Theorem
1.19, in a slightly different manner from that used so far.

Definition 2.7 (Smoothness norms). Suppose that $g : \mathbb{Z} \to \mathbb{R}/\mathbb{Z}$ is a polynomial
sequence of degree $d$. Then $g$ may be written uniquely as

$$g(n) = \alpha_0 + \alpha_1 \binom{n}{1} + \dots + \alpha_d \binom{n}{d},$$

where $\alpha_j$ is in fact equal to $\partial^j g(0)$. For any $N > 0$ we define the smoothness norm

$$\|g\|_{C^\infty[N]} := \sup_{1 \le j \le d} N^j \|\alpha_j\|_{\mathbb{R}/\mathbb{Z}}.$$

The smoothness norm $\|\cdot\|_{C^\infty[N]}$ is designed to capture the notion of a polynomial
sequence which is slowly-varying. Indeed, the following lemma is easily verified:

Lemma 2.8 (Smooth polynomials vary slowly). Let $g : \mathbb{Z} \to \mathbb{R}/\mathbb{Z}$ be a polynomial
sequence of degree $d$, and let $N > 0$. Then for any $n \in [N]$ we have

$$\|g(n) - g(n-1)\|_{\mathbb{R}/\mathbb{Z}} \ll_d \frac{1}{N} \|g\|_{C^\infty[N]}.$$

In view of this lemma, we see that Theorem 1.16 will be an immediate consequence
of the following more precise statement. This is in fact the main technical result in our
paper and we will use it to derive all our other main results.

Theorem 2.9 (Quantitative Leibman theorem). Let $m, d \ge 0$, $0 < \delta < 1/2$ and
$N \ge 1$. Suppose that $G/\Gamma$ is an $m$-dimensional nilmanifold together with a filtration $G_\bullet$
and that $X$ is a $\frac{1}{\delta}$-rational Mal'cev basis adapted to $G_\bullet$. Suppose that $g \in \mathrm{poly}(\mathbb{Z}, G_\bullet)$.
If $(g(n)\Gamma)_{n \in [N]}$ is not $\delta$-equidistributed, then there is a horizontal character $\eta$ with $0 <
|\eta| \ll \delta^{-O_{m,d}(1)}$ such that

$$\|\eta \circ g\|_{C^\infty[N]} \ll \delta^{-O_{m,d}(1)}.$$
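
The smoothness norm of Definition 2.7 and the slow variation asserted in Lemma 2.8 can be explored numerically. The following Python sketch is only an illustration under stated assumptions: it works with exact rationals, takes $\partial$ to be the forward difference $\partial g(n) = g(n+1) - g(n)$, takes the supremum over $1 \le j \le d$, and uses the explicit constant $d$ in place of the unspecified implied constant (this constant follows from $|\binom{n-1}{j-1}| \le N^{j-1}$ for $n \in [N]$). All function names are hypothetical.

```python
from fractions import Fraction
from math import comb


def dist_to_nearest_integer(x: Fraction) -> Fraction:
    """The norm ||x||_{R/Z}: distance from x to the nearest integer."""
    frac = x - (x.numerator // x.denominator)  # fractional part in [0, 1)
    return min(frac, 1 - frac)


def binomial_coefficients(g, d):
    """Coefficients alpha_j in g(n) = sum_j alpha_j * C(n, j), via forward differences at 0."""
    values = [Fraction(g(n)) for n in range(d + 1)]
    alphas = []
    while values:
        alphas.append(values[0])
        values = [b - a for a, b in zip(values, values[1:])]
    return alphas


def evaluate_from_binomials(alphas, n):
    """Evaluate sum_j alpha_j * C(n, j) for an integer n >= 0."""
    return sum(a * comb(n, j) for j, a in enumerate(alphas))


def smoothness_norm(alphas, N):
    """sup over 1 <= j <= d of N^j * ||alpha_j||_{R/Z}."""
    return max(
        (N ** j * dist_to_nearest_integer(a) for j, a in enumerate(alphas) if j >= 1),
        default=Fraction(0),
    )


def g(n):
    """Example polynomial of degree 2 (an arbitrary choice for illustration)."""
    return Fraction(n * n, 100) + Fraction(n, 7)
```

For this example the coefficients are $\alpha_1 = 107/700$ and $\alpha_2 = 1/50$, so with $N = 10$ the norm is $\max(107/70, 2) = 2$, and each increment $\|g(n) - g(n-1)\|_{\mathbb{R}/\mathbb{Z}}$ with $n \in [10]$ is at most $d \cdot 2 / N = 2/5$.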

Notes on reading the paper. As with so many papers, some parts of this work
are merely technical and other parts represent deeper ideas of greater interest. There
are quite a number of computations in this paper in which one has to show, say, that a
certain integer is bounded polynomially by another, or that a certain basis is $O(\delta^{-O(1)})$-rational.
All such computations are of the technical variety and should certainly be
ignored on a first reading. They are all in a sense "clear"; their proofs proceed by algebra
of a type which could hardly be expected to introduce non-polynomial dependencies.
It is possible that this could even be encoded in some relatively soft "proof-theoretic"
language, but we have chosen not to follow such a path.

We begin with several sections containing motivating examples. In §3 we will discuss
linear flows on tori $\mathbb{R}^m/\mathbb{Z}^m$, in §4 we shall discuss polynomial flows on $\mathbb{R}/\mathbb{Z}$, and in §5
we will look at linear flows on the 2-step Heisenberg nilmanifold (1.1). Some lemmas
from these sections will be required in the sequel.

We then begin the study of the general case. In §6 we study the algebraic properties
of polynomial sequences on nilpotent groups following Lazard and Leibman. There 
is a rich general theory here which is not evident from the study of the abelian and 
Heisenberg examples. 

We then turn to the full proof of Theorem 2.9, the quantitative Leibman theorem.
This is the technical heart of the paper and is given in the (rather long) §7.

In §8 we use a straightforward iteration argument to bootstrap Theorem 2.9 to a
multiparameter version of itself, namely Theorem 8.6. In §9 we then establish a preliminary
multiparameter factorization theorem, Proposition 9.2, which is a fairly short
consequence of Theorem 8.6. In §10 we then iterate this proposition, obtaining a
multiparameter theorem (Theorem 10.2) which then easily implies Theorem 1.19 (and hence
Corollary 1.20) as special cases.

The appendix contains basic results on bases and nilmanifolds. 

There is unfortunately a large amount of notation in this paper. In Figure 1 the key
objects in the argument are briefly described.

3. A quantitative Kronecker theorem 

In this section we prove Theorem 2.9 for linear sequences on the torus $\mathbb{R}^m/\mathbb{Z}^m$, that
is to say we establish a quantitative Kronecker theorem. The methods and the result
are very standard.

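Proposition 3.1 (Quantitative Kronecker Theorem). Let $m \ge 1$, let $0 < \delta < 1/2$,
and let $\alpha \in \mathbb{R}^m$. If the sequence $(\alpha n \ (\mathrm{mod}\ \mathbb{Z}^m))_{n \in [N]}$ is not $\delta$-equidistributed in the
additive torus $\mathbb{R}^m/\mathbb{Z}^m$, then there exists $k \in \mathbb{Z}^m$ with $0 < |k| \ll \delta^{-O_m(1)}$ such that
$\|k \cdot \alpha\|_{\mathbb{R}/\mathbb{Z}} \ll \delta^{-O_m(1)}/N$.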

Remark. We leave it to the reader to check that this really is the specialization
of Theorem 2.9 to the case of linear orbits on the torus $\mathbb{R}^m/\mathbb{Z}^m$. This may be found
helpful in understanding some of our notation. Note in particular that in this case the
horizontal torus is simply $\mathbb{R}^m/\mathbb{Z}^m$, and we may take $\pi$ to be the identity map.

Figure 1. A list of key objects in the paper, together with brief descriptions
of these objects, and the location where they are first defined or
introduced.

Proof. By Definition 1.2 there is a Lipschitz function $F : \mathbb{R}^m/\mathbb{Z}^m \to \mathbb{R}$ such that

$$\Big| \mathbb{E}_{n \in [N]} F(\alpha n \ (\mathrm{mod}\ \mathbb{Z}^m)) - \int_{\mathbb{R}^m/\mathbb{Z}^m} F \, d\theta \Big| > \delta \|F\|_{\mathrm{Lip}}. \tag{3.1}$$

At the expense of replacing $\delta$ by $\delta/2$ we may translate $F$, add a constant to it and rescale
in such a way that $\int F = 0$ and $\|F\|_{\mathrm{Lip}} = 1$. By approximating $F$ by smooth functions
we may assume that $F$ is smooth (we do this to avoid any technical issues regarding
convergence of Fourier series). We now use a standard manoeuvre to approximate $F$ by
a function which has finite support in frequency space (cf. [Tlj Lemma A.9]).

Consider the Fejér kernel $K : \mathbb{R}^m/\mathbb{Z}^m \to \mathbb{R}^+$ defined by

$$K := \frac{1}{\mathrm{mes}(Q)} 1_Q * \frac{1}{\mathrm{mes}(Q)} 1_Q,$$

where Q := [— j§^] m C R m /Z m is a small cube, and * denotes the usual convolution 
operation on the torus $\mathbb{R}^m/\mathbb{Z}^m$. It is immediate that $K$ is a non-negative function
supported in $Q$ with

$$\int_{\mathbb{R}^m/\mathbb{Z}^m} K = 1. \tag{3.2}$$

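A simple calculation also establishes the estimate

$$\sum_{k \in \mathbb{Z}^m : |k| > M} |\hat{K}(k)| \ll_m \delta^{-2m} M^{-1} \tag{3.3}$$

for all $M > 1$, where the Fourier coefficient is defined by

$$\hat{K}(k) := \int_{\mathbb{R}^m/\mathbb{Z}^m} K(\theta) e(-\theta \cdot k) \, d\theta$$

and $e(x) := e^{2\pi i x}$ is the standard character on $\mathbb{R}/\mathbb{Z}$. We also have the crude bound

$$|\hat{F}(k)| \le \|F\|_\infty \le \|F\|_{\mathrm{Lip}} \le 1 \tag{3.4}$$

for all $k \in \mathbb{Z}^m$.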

Set $F_1 := F * K$. Since $\|F\|_{\mathrm{Lip}} = 1$, and $K$ is supported in $Q$ and satisfies (3.2), a
standard computation shows that

$$\|F - F_1\|_\infty \le \delta/8.$$

Choose $M := C_m \delta^{-2m-1}$ for some suitably large $C_m$, and set

$$F_2(\theta) := \sum_{k \in \mathbb{Z}^m : 0 < |k| \le M} \hat{F}_1(k) e(k \cdot \theta).$$

Noting that $\hat{F}_1(0) = 0$, facts (3.3), (3.4) and the Fourier inversion formula imply that

$$\|F_1 - F_2\|_\infty \le \delta/8.$$

It follows that $\|F - F_2\|_\infty \le \delta/4$, which means in view of the failure of (3.1) that

$$|\mathbb{E}_{n \in [N]} F_2(n\alpha \mathbb{Z}^m)| \ge \delta/4.$$

Applying (3.4) once more we see that there is some $k$, $0 < |k| \le M$, such that

$$|\mathbb{E}_{n \in [N]} e(n k \cdot \alpha)| \gg_m \delta M^{-m} \gg_m \delta^{O_m(1)}.$$

The result now follows immediately from the standard estimate

$$|\mathbb{E}_{n \in [N]} e(nt)| \ll \min\Big(1, \frac{1}{N \|t\|_{\mathbb{R}/\mathbb{Z}}}\Big),$$

which follows from summing the geometric progression. □
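
The geometric-progression estimate used at the end of this proof is easy to test numerically. The sketch below is an illustration only: it uses the explicit form $|\mathbb{E}_{n \in [N]} e(nt)| \le \min(1, 1/(2N\|t\|_{\mathbb{R}/\mathbb{Z}}))$, which follows from $|\sum_{n=1}^{N} e(nt)| = |\sin(\pi N t)/\sin(\pi t)|$ and $|\sin(\pi t)| \ge 2\|t\|_{\mathbb{R}/\mathbb{Z}}$; the constant $1/2$ and the function names are choices made for this sketch.

```python
import cmath


def e(x: float) -> complex:
    """The standard character e(x) = exp(2*pi*i*x)."""
    return cmath.exp(2j * cmath.pi * x)


def dist_to_nearest_integer(t: float) -> float:
    """The norm ||t||_{R/Z}."""
    return abs(t - round(t))


def linear_average(t: float, N: int) -> complex:
    """The average E_{n in [N]} e(n t) over n = 1, ..., N."""
    return sum(e(n * t) for n in range(1, N + 1)) / N


def geometric_bound(t: float, N: int) -> float:
    """Upper bound min(1, 1 / (2 N ||t||)) for |E_{n in [N]} e(n t)|."""
    d = dist_to_nearest_integer(t)
    return 1.0 if d == 0 else min(1.0, 1.0 / (2 * N * d))
```

In particular the average can only be large when $\|t\|_{\mathbb{R}/\mathbb{Z}} \ll 1/N$, which is how the bound on $\|k \cdot \alpha\|_{\mathbb{R}/\mathbb{Z}}$ is extracted.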

Let us now record a corollary of the $m = 1$ version of this result which will be
used several times in the sequel. This gives stronger information in the case that
$(n\alpha \ (\mathrm{mod}\ \mathbb{Z}))_{n \in [N]}$ is very far from being equidistributed.

Lemma 3.2 (Strongly recurrent linear functions are highly non-diophantine). Let $\alpha \in
\mathbb{R}$, $0 < \delta < 1/2$, and $0 < \varepsilon \le \delta/2$, and let $I \subseteq \mathbb{R}/\mathbb{Z}$ be an interval of length $\varepsilon$
such that $\alpha n \in I$ for at least $\delta N$ values of $n \in [N]$. Then there is some $k \in \mathbb{Z}$ with
$0 < |k| \ll \delta^{-O(1)}$ such that $\|k\alpha\|_{\mathbb{R}/\mathbb{Z}} \ll \varepsilon \delta^{-O(1)}/N$.

Proof. Taking $F$ to be a Lipschitz approximation to the interval $I$, we see immediately
that our assumption precludes $(\alpha n \ (\mathrm{mod}\ \mathbb{Z}))_{n \in [N]}$ from being $\delta^{10}$-equidistributed. It
follows from the case $m = 1$ of Proposition 3.1 that there is some $k \in \mathbb{Z}$, $|k| \ll \delta^{-C}$,
such that $\|k\alpha\|_{\mathbb{R}/\mathbb{Z}} \ll \delta^{-C}/N$, where $C = O(1)$. Write $\beta := \|k\alpha\|_{\mathbb{R}/\mathbb{Z}}$. Let $n_0 \in \mathbb{Z}$ be
arbitrary, and suppose that $n'$ ranges over any interval of integers $J$ of length at most
$1/\beta$. The number of $n'$ for which $\alpha(n_0 + kn')\mathbb{Z} \in I$ is then at most $1 + \varepsilon/\beta$. Since $[N]$
may be divided into $\le 2k + \beta N$ progressions of the form $\{n_0 + kn' : n' \in J\}$ we obtain
from our assumption the inequality

$$\delta N \le \#\{n \in [N] : \alpha n \mathbb{Z} \in I\} \le \Big(1 + \frac{\varepsilon}{\beta}\Big)(2k + \beta N) \ll k + \frac{k\varepsilon}{\beta} + \beta N + \varepsilon N. \tag{3.5}$$

Now the lemma is trivial if $N \ll \delta^{-10C}$ and follows immediately from Proposition 3.1
when $\varepsilon \gg \delta^{10C}$, so suppose that neither of these is the case. Then all of the terms
except the second on the right-hand side of (3.5) are negligible, and we deduce that

$$\delta N \ll k\varepsilon/\beta.$$

This immediately implies the result. □

The main idea in the proof of Proposition 3.1, of course, was that the space of
Lipschitz functions is essentially spanned by the space of pure phase functions $e(k \cdot \theta)$.
Thus we were able to assert that if the condition (3.1) fails for some $F$, then it also fails
(albeit with a smaller value of $\delta$) for a pure phase function with not-too-large frequency.

A similar observation turns out to be essential in the analysis of polynomial sequences
on general nilmanifolds $G/\Gamma$ (cf. the proof of [22, Theorem 2.17]). Though we will not
be discussing general sequences for quite a while, this does seem to be an appropriate
place to state and prove a lemma which generalizes the observations just made. For
this, we will be working primarily on the vertical torus:

Definition 3.3 (Vertical torus). Suppose that $G/\Gamma$ is a nilmanifold and that $G_\bullet$ is a
filtration of degree $d$. Note that $G_d$ then lies in the centre of $G$. We define the vertical
torus to be $G_d/(\Gamma \cap G_d)$, and the vertical dimension $m_d$ to be $m_d := \dim G_d$; the last
$m_d$ coordinates of the Mal'cev coordinate map $\psi$ may be used to canonically identify
$G_d$ and $G_d/(\Gamma \cap G_d)$ with $\mathbb{R}^{m_d}$ and $\mathbb{R}^{m_d}/\mathbb{Z}^{m_d}$ respectively. Also observe that the vertical
torus acts canonically on the nilmanifold $G/\Gamma$, thus we can define^5 $\theta y \in G/\Gamma$ for all
$\theta \in \mathbb{R}^{m_d}/\mathbb{Z}^{m_d}$ and $y \in G/\Gamma$.

Definition 3.4 (Vertical characters). A vertical character is a continuous homomorphism
$\xi : G_d \to \mathbb{R}/\mathbb{Z}$ such that $\Gamma \cap G_d \subseteq \ker \xi$ (in particular, $\xi$ can also be meaningfully
defined on $G_d/\Gamma_d \cong \mathbb{R}^{m_d}/\mathbb{Z}^{m_d}$). Any such character has the form $\xi(x) = k \cdot x$ for a
unique $k \in \mathbb{Z}^{m_d}$, where we identify $G_d$ with $\mathbb{R}^{m_d}$. We refer to $k$ as the frequency of the
character $\xi$, and $|\xi| := |k|$ as the frequency magnitude. For instance the trivial character
$\xi = 0$ has frequency 0.

Definition 3.5 (Vertical oscillation). Let $F : G/\Gamma \to \mathbb{C}$ be a Lipschitz function and
suppose that $\xi$ is a vertical character. We say that $F$ has vertical oscillation $\xi$ if we
have $F(g_d \cdot x) = e(\xi(g_d)) F(x)$ for all $g_d \in G_d$ and $x \in G/\Gamma$.



5 Here we have a slight clash between the additive notation for the torus $\mathbb{R}^{m_d}/\mathbb{Z}^{m_d}$ and the
multiplicative notation for the group $G$. We hope this will not confuse the reader.






The next definition is a repetition of Definition 1.2, except that we specialize to
functions with a fixed vertical oscillation $\xi$.

Definition 3.6 (Equidistribution along a vertical character). Let $g : \mathbb{Z} \to G$ be a
polynomial sequence. We say that $(g(n)\Gamma)_{n \in [N]}$ is $\delta$-equidistributed along a vertical
character $\xi$ if

$$\Big| \mathbb{E}_{n \in [N]} F(g(n)\Gamma) - \int_{G/\Gamma} F \Big| \le \delta \|F\|_{\mathrm{Lip}}$$

for all Lipschitz functions $F : G/\Gamma \to \mathbb{C}$ with vertical oscillation $\xi$.

The next lemma states that in order to check whether a sequence is equidistributed, 
it suffices to test that sequence against functions possessing a vertical oscillation. 

Lemma 3.7 (Vertical oscillation reduction). Let $G/\Gamma$ be a nilmanifold together with
a filtration $G_\bullet$ of degree $d$. Let be as above, and let $0 < \delta < 1/2$. Suppose that
$g : \mathbb{Z} \to G$ is a polynomial sequence and that $(g(n)\Gamma)_{n \in [N]}$ is not $\delta$-equidistributed.
Then there is a vertical character $\xi$ with $|\xi| \ll \delta^{-O_{m,d}(1)}$ such that $(g(n)\Gamma)_{n \in [N]}$ is not
$\delta^{O_{m,d}(1)}$-equidistributed along the vertical oscillation $\xi$.

Proof. We merely sketch this, for the argument is little more than a repetition of that
used to prove Proposition 3.1. We begin with the same reductions. That is, assuming
the existence of an $F : G/\Gamma \to \mathbb{C}$ such that

$$\Big| \mathbb{E}_{n \in [N]} F(g(n)\Gamma) - \int_{G/\Gamma} F \Big| > \delta \|F\|_{\mathrm{Lip}}, \tag{3.6}$$

we weaken $\delta$ to $\delta/2$ and assume that $\int_{G/\Gamma} F = 0$, that $\|F\|_{\mathrm{Lip}} = 1$ and that $F$ is smooth.

Let $K$ be the same Fejér-type kernel as before, and now take $F_1 : G/\Gamma \to \mathbb{C}$ to be the
function obtained by convolving with $K$ in each $G_d/(\Gamma \cap G_d) \cong \mathbb{R}^{m_d}/\mathbb{Z}^{m_d}$-fibre, that is
to say

$$F_1(y) := \int_{\mathbb{R}^{m_d}/\mathbb{Z}^{m_d}} F(\theta y) K(\theta) \, d\theta.$$

Fourier expansion on $\mathbb{R}^{m_d}/\mathbb{Z}^{m_d}$ gives

$$F_1(y) = \sum_{k \in \mathbb{Z}^{m_d}} F^\wedge(y; k) \hat{K}(k),$$

where

$$F^\wedge(y; k) := \int_{\mathbb{R}^{m_d}/\mathbb{Z}^{m_d}} F(\theta y) e(-k \cdot \theta) \, d\theta.$$

Now for $g_d \in G_d \cong \mathbb{R}^{m_d}$ we have

$$F^\wedge(g_d y; k) = \int_{\mathbb{R}^{m_d}/\mathbb{Z}^{m_d}} F((\theta + g_d) y) e(-k \cdot \theta) \, d\theta = e(k \cdot g_d) F^\wedge(y; k),$$

thus each function $F^\wedge(y; k)$ has vertical oscillation $\xi$, where $\xi(x) := k \cdot x$ is the vertical
character with frequency $k$.

Using exactly the same estimates as in the proof of Proposition 3.1, we have $\|F -
F_2\|_\infty \le \delta/4$, where

$$F_2(y) := \sum_{k \in \mathbb{Z}^{m_d} : |k| \le Q} F^\wedge(y; k) \hat{K}(k)$$

for some $Q = C_{m_d} \delta^{-2m_d - 1}$. The rest of the argument proceeds exactly as before, and
we see that if we take $\tilde{F}(y) := F^\wedge(y; k)$ for suitable $k \in \mathbb{Z}^{m_d}$, $|k| \ll \delta^{-O_{m,d}(1)}$, we have

$$\Big| \mathbb{E}_{n \in [N]} \tilde{F}(g(n)\Gamma) - \int_{G/\Gamma} \tilde{F} \Big| \gg \delta^{O_{m,d}(1)} \|\tilde{F}\|_{\mathrm{Lip}}.$$

Thus $(g(n)\Gamma)_{n \in [N]}$ is not $\delta^{O_{m,d}(1)}$-equidistributed along the vertical character $\xi$, as
desired. □


4. The van der Corput trick and polynomial flows on tori 

In the last section we introduced one important trick: the idea of decomposing
a Lipschitz function into phases using Fourier analysis. In this section we introduce a
second trick, namely the use of van der Corput's inequality, and use this trick to study
polynomial sequences on tori $\mathbb{R}^m/\mathbb{Z}^m$. Although our language is somewhat different,
this is really just a reprise of the standard theory of Weyl sums as used for instance in
the study of Waring's problem (see, for example, [33]).

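Lemma 4.1 (van der Corput inequality). Let $N, H$ be positive integers and suppose
that $(a_n)_{n \in [N]}$ is a sequence of complex numbers. Extend $(a_n)$ to all of $\mathbb{Z}$ by defining
$a_n := 0$ when $n \notin [N]$. Then

$$|\mathbb{E}_{n \in [N]} a_n|^2 \le \frac{N + H}{HN} \sum_{|h| \le H} \Big(1 - \frac{|h|}{H}\Big) \mathbb{E}_{n \in [N]} a_n \overline{a_{n+h}}.$$

Proof. We have

$$\sum_{n \in [N]} a_n = \frac{1}{H} \sum_{-H < n \le N} \sum_{h=0}^{H-1} a_{n+h}.$$

Thus, applying the Cauchy-Schwarz inequality, we have

$$\Big| \sum_{n \in [N]} a_n \Big|^2 = \frac{1}{H^2} \Big| \sum_{-H < n \le N} \sum_{h=0}^{H-1} a_{n+h} \Big|^2 \le \frac{N + H}{H^2} \sum_{-H < n \le N} \Big| \sum_{h=0}^{H-1} a_{n+h} \Big|^2 = \frac{N + H}{H^2} \sum_{-H < n \le N} \sum_{h=0}^{H-1} \sum_{h'=0}^{H-1} a_{n+h} \overline{a_{n+h'}},$$

which is equivalent to the right hand side of the claimed inequality. □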
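
The van der Corput inequality can be checked numerically on small examples. The Python sketch below is an illustration only, and assumes the inequality in the form $|\mathbb{E}_{n \in [N]} a_n|^2 \le \frac{N+H}{HN} \sum_{|h| \le H} (1 - |h|/H)\, \mathbb{E}_{n \in [N]} a_n \overline{a_{n+h}}$, with $a_n$ extended by zero outside $[N]$; this is the form produced by the Cauchy-Schwarz argument above. The function name is hypothetical.

```python
def van_der_corput_sides(a, H):
    """Return (lhs, rhs) of the van der Corput inequality for the sequence a_1, ..., a_N.

    lhs = |E_{n in [N]} a_n|^2
    rhs = (N + H) / (H N) * sum_{|h| <= H} (1 - |h|/H) * E_{n in [N]} a_n * conj(a_{n+h})
    The list a holds a_1, ..., a_N; the sequence is extended by zero outside [N].
    """
    N = len(a)

    def term(n):
        return a[n - 1] if 1 <= n <= N else 0

    lhs = abs(sum(a) / N) ** 2
    rhs = 0
    for h in range(-H, H + 1):
        correlation = sum(term(n) * term(n + h).conjugate() for n in range(1, N + 1)) / N
        rhs += (1 - abs(h) / H) * correlation
    rhs *= (N + H) / (H * N)
    # The weighted sum is real because the terms for h and -h are complex conjugates.
    return lhs, rhs.real
```

Taking $H = 1$ recovers the Cauchy-Schwarz bound $|\mathbb{E} a_n|^2 \le \frac{N+1}{N} \mathbb{E} |a_n|^2$, while larger $H$ brings in the shifted correlations that drive the differencing argument of this section.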

We will use the following simple (and rather crude) corollary of this, which we phrase
in the contrapositive.

Corollary 4.2 (van der Corput). Let $N$ be a positive integer and suppose that $(a_n)_{n \in [N]}$
is a sequence of complex numbers with $|a_n| \le 1$. Extend $(a_n)$ to all of $\mathbb{Z}$ by defining
$a_n := 0$ when $n \notin [N]$. Suppose that $0 < \delta < 1$ and that

$$|\mathbb{E}_{n \in [N]} a_n| \ge \delta.$$

Then for at least $\delta^2 N/8$ values of $h \in [N]$ we have

$$|\mathbb{E}_{n \in [N]} a_{n+h} \overline{a_n}| \ge \delta^2/8.$$

Proof. The result is vacuous if $N \le 4/\delta^2$, so assume this is not the case. Suppose for
a contradiction that the result is false. Apply Lemma 4.1 with $H = N$. Then it is easy
to see that we have




where we have used the trivial estimate $|\mathbb{E}_{n \in [N]} a_n \overline{a_{n+h}}| \le 1$ for those $h \in [N]$ such that
$|\mathbb{E}_{n \in [N]} a_n \overline{a_{n+h}}| \ge \delta^2/8$, of which there are no more than $\delta^2 N/8$. Rearranging and using
the fact that $N > 4/\delta^2$ we see that this is a contradiction. □

The next proposition is the main result of this section, and is Theorem 2.9 in the case
$G = \mathbb{R}$, $\Gamma = \mathbb{Z}$ and with $g : \mathbb{Z} \to G$ an arbitrary polynomial.

Proposition 4.3 (Weyl). Suppose that $g : \mathbb{Z} \to \mathbb{R}$ is a polynomial of degree $d$, and let
$0 < \delta < 1/2$. Then either $(g(n) \ (\mathrm{mod}\ \mathbb{Z}))_{n \in [N]}$ is $\delta$-equidistributed, or else there is an
integer $k$, $1 \le k \ll \delta^{-O_d(1)}$, such that $\|kg \ (\mathrm{mod}\ \mathbb{Z})\|_{C^\infty[N]} \ll \delta^{-O_d(1)}$.

We will deduce this from the following, which is nothing but a reformulation of Weyl's
exponential sum estimate (see e.g. [33]).

Lemma 4.4 (Weyl's exponential sum estimate). Suppose that $g : \mathbb{Z} \to \mathbb{R}$ is a polynomial
of degree $d$ with leading coefficient $\alpha_d$ and that

$$|\mathbb{E}_{n \in [N]} e(g(n))| \ge \delta$$

for some $0 < \delta < 1/2$. Then there is $k \in \mathbb{Z}$, $0 < |k| \ll \delta^{-O_d(1)}$, such that

$$\|k\alpha_d\|_{\mathbb{R}/\mathbb{Z}} \ll \delta^{-O_d(1)}/N^d.$$

Proof. We proceed by induction on $d$, the result having been established in §3 in the
case $d = 1$. We may assume that $N > \delta^{-C_d}$ for some large $C_d$ since the result is trivial
otherwise. Applying van der Corput's estimate in the form of Corollary 4.2, we deduce
that there are $\gg \delta^2 N$ values of $h \in [N]$ such that

$$|\mathbb{E}_{n \in [N]} e(g(n + h) - g(n))| \gg \delta^2.$$

For each such $h$, $g(n + h) - g(n)$ is a polynomial with degree $d - 1$ and leading coefficient
$hd\alpha_d$. Thus by the induction hypothesis there is, for $\gg \delta^2 N$ values of $h \in [N]$, some
$1 \le q_h \ll \delta^{-O_d(1)}$ such that we have

$$\|h q_h d \alpha_d\|_{\mathbb{R}/\mathbb{Z}} \ll \delta^{-O_d(1)}/N^{d-1}$$

for each of these values of $h$. Pigeonholing in the $q_h$, this implies that there is $q$,
$1 \le q \ll \delta^{-O_d(1)}$, such that

$$\|q h d \alpha_d\|_{\mathbb{R}/\mathbb{Z}} \ll \delta^{-O_d(1)}/N^{d-1}$$

for $\gg \delta^{O_d(1)} N$ values of $h \in [N]$. Since $N$ is so large, Lemma 3.2 may be applied to
conclude that there is $q' \ll \delta^{-O_d(1)}$ such that

$$\|q q' \alpha_d\|_{\mathbb{R}/\mathbb{Z}} \ll \delta^{-O_d(1)}/N^d.$$

Redefining $q := qq'$, the result follows. □
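
The differencing step in this proof rests on the fact that $n \mapsto g(n+h) - g(n)$ has degree $d - 1$ and leading coefficient $h d \alpha_d$, where $\alpha_d$ is the coefficient of $n^d$ in $g$. The following Python sketch checks this on an example with exact arithmetic; it is an illustration only, and the representation of polynomials by coefficient lists and the function name are choices made here.

```python
from fractions import Fraction
from math import comb


def difference_polynomial(coeffs, h):
    """Coefficients of n -> g(n + h) - g(n), where coeffs[i] is the coefficient of n**i in g.

    Uses (n + h)**i = sum_j C(i, j) * h**(i - j) * n**j; the j = i term cancels against g(n).
    """
    d = len(coeffs) - 1
    out = [Fraction(0)] * max(d, 1)
    for i, c in enumerate(coeffs):
        for j in range(i):
            out[j] += Fraction(c) * comb(i, j) * h ** (i - j)
    return out
```

For $g(n) = 2n^3 + n$ and $h = 5$ this gives $g(n+5) - g(n) = 30n^2 + 150n + 255$, whose leading coefficient is $h \cdot d \cdot \alpha_d = 5 \cdot 3 \cdot 2$.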

Proof of Proposition 4.3. In this proof we allow all implied constants to depend on
$d$. Suppose that $g : \mathbb{Z} \to \mathbb{R}$ is a polynomial sequence of degree $d$ such that the orbit
$(g(n)\mathbb{Z})_{n \in [N]}$ on $\mathbb{R}/\mathbb{Z}$ is not $\delta$-equidistributed. Expand $g$ as a Taylor series

$$g(n) = \binom{n}{d} \alpha_d + \dots + \binom{n}{1} \alpha_1 + \alpha_0 \tag{4.1}$$

and suppose as a hypothesis for induction on $r$, $0 \le r < d$, that we have shown that each
of the coefficients $\alpha_d, \alpha_{d-1}, \dots, \alpha_{d-r}$ is nearly rational in the sense that $\|q\alpha_{d-i}\|_{\mathbb{R}/\mathbb{Z}} \ll
\delta^{-O(1)}/N^{d-i}$ for some $q \ll \delta^{-O(1)}$ for $i = 0, \dots, r$. (The implied constants in the $O()$
notation may increase with each induction step, but there are only $d$ such steps, and
we are allowing these constants to depend on $d$, so this is harmless.) The statement we
are trying to prove, Proposition 4.3, is the case $r = d - 1$.

Now by the argument used in proving Proposition 3.1 (or indeed by simply quoting
Lemma 3.7), there is $k \in \mathbb{Z}$, $0 < |k| \ll \delta^{-O(1)}$, such that

$$|\mathbb{E}_{n \in [N]} e(kg(n))| \gg \delta^{O(1)}. \tag{4.2}$$

The base case $r = 0$ of the induction follows immediately from Lemma 4.4. Suppose
now that we have established the result for some $r$, and wish to establish it for $r + 1$.
Set

$$g'(n) := g(n) - \sum_{i=0}^{r} \alpha_{d-i} \binom{n}{d-i} = \alpha_{d-r-1} \binom{n}{d-r-1} + \dots + \alpha_0.$$

Set $Q := q\,d!$, and write $\alpha_{d-i} = a_{d-i}/q + O(\delta^{-O(1)}/N^{d-i})$, $i = 0, \dots, r$ for some integers
$a_{d-i}$. For any $n_0 \in \mathbb{Z}$ and any $n' \in \mathbb{Z}$ we have

$$g'(n_0 + Qn') - g'(n_0) = g(n_0 + Qn') - g(n_0) - \sum_{i=0}^{r} \frac{a_{d-i}}{q} \Big( \binom{n_0 + Qn'}{d-i} - \binom{n_0}{d-i} \Big) + \sum_{i=0}^{r} O\Big( \frac{\delta^{-O(1)}}{N^{d-i}} \Big) \Big( \binom{n_0 + Qn'}{d-i} - \binom{n_0}{d-i} \Big).$$

Set $N' := \lfloor \delta^{C_d} N \rfloor$ for some suitably large $C_d$ and suppose that $n' \in [N']$ and also that
$|n_0| \le 2N$. Then the last term here is $O(\delta^{C_d - O(1)})$. The sum involving the integers $a_{d-i}$
is itself an integer, since

$$\binom{n_0 + Qn'}{j} - \binom{n_0}{j} \equiv 0 \pmod{q}$$

for all $j \le d$. Thus we see that if $n' \in [N']$ and $|n_0| \le 2N$ then

$$g'(n_0 + Qn') - g'(n_0) = g(n_0 + Qn') - g(n_0) + O(\delta^{C_d - O(1)}) \pmod{\mathbb{Z}}. \tag{4.3}$$

Splitting $[N]$ into progressions of common difference $Q$ and length $N'$ plus a negligible
error we see from (4.2) that there is $n_0$, $|n_0| \le 2N$, such that

$$|\mathbb{E}_{n' \in [N']} e(kg(n_0 + Qn'))| \gg \delta^{O(1)}.$$

It follows from (4.3) that

$$|\mathbb{E}_{n' \in [N']} e(kg'(n_0 + Qn'))| \gg \delta^{O(1)}.$$

By Lemma 4.4 we see that the leading coefficient $\alpha' := kQ^{d-r-1} \alpha_{d-r-1}/(d - r - 1)!$
of this polynomial is nearly rational in the sense that there is $1 \le q' \ll \delta^{-O(1)}$ such
that $\|q'\alpha'\|_{\mathbb{R}/\mathbb{Z}} \ll \delta^{-O(1)}/N^{d-r-1}$. It follows that there is $1 \le q'' \ll \delta^{-O(1)}$ such that
$\|q''\alpha_{d-r-1}\|_{\mathbb{R}/\mathbb{Z}} \ll \delta^{-O(1)}/N^{d-r-1}$. Setting $q := qq''$ we now clearly have $1 \le q \ll \delta^{-O(1)}$
and also $\|q\alpha_{d-i}\|_{\mathbb{R}/\mathbb{Z}} \ll \delta^{-O(1)}/N^{d-i}$ for $i = 0, \dots, r + 1$.

This concludes the proof of the inductive step and hence of the proposition.

We will also need a "strong recurrence" result for polynomials $g : \mathbb{Z} \to \mathbb{R}$, generalizing
the linear result, Lemma 3.2, that we obtained in the last section. This is in fact an
easy deduction from Proposition 4.3 and Lemma 3.2.

Lemma 4.5 (Strongly recurrent polynomials are highly non-diophantine). Let $d \ge 0$,
and suppose that $g : \mathbb{Z} \to \mathbb{R}$ is a polynomial sequence of degree $d$. Suppose that $0 < \delta <
1/2$ and $\varepsilon \le \delta/2$, that $I \subseteq \mathbb{R}/\mathbb{Z}$ is an interval of length $\varepsilon$, and that $g(n) \ (\mathrm{mod}\ \mathbb{Z}) \in I$
for at least $\delta N$ values of $n \in [N]$. Then there is a $k \in \mathbb{Z}$, $0 < |k| \ll \delta^{-O_d(1)}$, such that
$\|kg \ (\mathrm{mod}\ \mathbb{Z})\|_{C^\infty[N]} \ll \varepsilon \delta^{-O_d(1)}$.
Proof. In this proof we allow all implied constants to depend on $d$. If $\varepsilon \gg \delta^{C_d}$ for some
large $C_d$ depending only on $d$ then the result follows immediately from Proposition 4.3,
so assume this is not the case. Expand $g$ in a Taylor series as in (4.1), with coefficients
$\alpha_0, \dots, \alpha_d$. It follows from the assumption that none of the polynomials $\lambda g$, $\lambda \le \delta/2\varepsilon$, is
$\delta^{O(1)}$-equidistributed on $[N]$. Thus by Proposition 4.3 we see that for each $\lambda \le \delta/2\varepsilon$
there is $q_\lambda \ll \delta^{-O(1)}$ such that $\|q_\lambda \lambda \alpha_i\|_{\mathbb{R}/\mathbb{Z}} \ll \delta^{-O(1)}/N^i$ for $i = 0, \dots, d$. Pigeonholing
in the possible values of $q_\lambda$ we see that there is $q \ll \delta^{-O(1)}$ such that for $\gg \delta^{O(1)}/\varepsilon$
values of $\lambda \le \delta/2\varepsilon$ we have $\|\lambda q \alpha_i\|_{\mathbb{R}/\mathbb{Z}} \ll \delta^{-O(1)}/N^i$ for each $i = 0, \dots, d$. It follows
from Lemma 3.2 that for each $i$ there is $q_i \ll \delta^{-O(1)}$ such that $\|q_i \alpha_i\|_{\mathbb{R}/\mathbb{Z}} \ll \varepsilon \delta^{-O(1)}/N^i$.
Writing $q := q_1 \cdots q_d$ we see that $q \ll \delta^{-O(1)}$ and that $\|q\alpha_i\|_{\mathbb{R}/\mathbb{Z}} \ll \varepsilon \delta^{-O(1)}/N^i$ for all $i$.
This concludes the proof of the lemma. □

5. The Heisenberg example 

In this section we discuss the first example which is not just a rephrasing of classical
work on equidistribution, establishing Theorem 2.9 for a linear sequence on the Heisenberg
nilmanifold (1.1); thus $s = d = 2$ and $m = 3$. Strictly speaking, this section is not
necessary in order to prove Theorem 2.9 in the general case; however, we present this
"worked example" here in order to illustrate the key ideas of the main argument in a
simplified model setting. (Also, a key computation in this setting, namely Proposition
5.3, will be reused in the main argument.) As in the preceding section, the idea is to
use van der Corput's inequality to reduce the problem to a simpler problem, and in
particular to reduce to a "1-step" or "abelian" problem that can be treated by the tools
of the previous section. This turns out to work, but it will take a certain amount of
algebraic manipulation to see the 1-step structure emerge from van der Corput's inequality
applied to the 2-step Heisenberg situation.

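Let us begin with a brief tour of the Heisenberg example (1.1). We have

$$\mathfrak{g} = \begin{pmatrix} 0 & \mathbb{R} & \mathbb{R} \\ 0 & 0 & \mathbb{R} \\ 0 & 0 & 0 \end{pmatrix},$$

with the exponential map being given by

$$\exp \begin{pmatrix} 0 & x & y \\ 0 & 0 & z \\ 0 & 0 & 0 \end{pmatrix} = \begin{pmatrix} 1 & x & y + \tfrac{1}{2}xz \\ 0 & 1 & z \\ 0 & 0 & 1 \end{pmatrix}$$

and the logarithm map by

$$\log \begin{pmatrix} 1 & x & y \\ 0 & 1 & z \\ 0 & 0 & 1 \end{pmatrix} = \begin{pmatrix} 0 & x & y - \tfrac{1}{2}xz \\ 0 & 0 & z \\ 0 & 0 & 0 \end{pmatrix}.$$

Observe that $\log \Gamma$ is not quite a lattice in $\mathbb{R}^3$, although it is a finite union of lattices.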

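Consider the elements $X_1, X_2, X_3 \in \mathfrak{g}$ defined by

$$X_1 := \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \quad X_2 := \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{pmatrix} \quad \text{and} \quad X_3 := \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}.$$

It is easy to see that $\mathcal{X} = \{X_1, X_2, X_3\}$ is a Mal'cev basis adapted to
the lower central series filtration $G_\bullet$. A simple computation confirms that

$$\exp(t_1X_1)\exp(t_2X_2)\exp(t_3X_3) = \begin{pmatrix} 1 & t_1 & t_1t_2 + t_3 \\ 0 & 1 & t_2 \\ 0 & 0 & 1 \end{pmatrix},$$

and so the Mal'cev coordinate map $\psi_{\mathcal{X}} : G \to \mathbb{R}^3$ is given by

$$\psi_{\mathcal{X}} \begin{pmatrix} 1 & x & y \\ 0 & 1 & z \\ 0 & 0 & 1 \end{pmatrix} = (x, z, y - xz).$$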

The horizontal torus is isomorphic to $(\mathbb{R}/\mathbb{Z})^2$, and the projection $\pi : G \to (\mathbb{R}/\mathbb{Z})^2$ is
given by

$$\pi \begin{pmatrix} 1 & x & y \\ 0 & 1 & z \\ 0 & 0 & 1 \end{pmatrix} = (x, z).$$
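The exponential, logarithm and Mal'cev coordinate formulas above are easy to confirm by machine. The following Python sketch does so in exact rational arithmetic; the helper names (`heis`, `exp_heis`, `log_heis`, `psi`, `from_malcev`) and the nested-list representation of matrices are illustrative assumptions, not notation from the paper.

```python
from fractions import Fraction as Fr

def matmul(A, B):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def heis(x, y, z):
    """Upper unitriangular matrix with entries x, y, z at (1,2), (1,3), (2,3)."""
    return [[1, x, y], [0, 1, z], [0, 0, 1]]

def exp_heis(x, y, z):
    """exp(A) = I + A + A^2/2 for the strictly upper triangular A with entries x, y, z."""
    return heis(x, y + x * z / Fr(2), z)

def log_heis(M):
    """log(M) = (M - I) - (M - I)^2/2, returned as the entries (x, y, z)."""
    x, y, z = M[0][1], M[0][2], M[1][2]
    return (x, y - x * z / Fr(2), z)

def psi(M):
    """Mal'cev coordinates (x, z, y - xz) of a group element."""
    x, y, z = M[0][1], M[0][2], M[1][2]
    return (x, z, y - x * z)

def from_malcev(t1, t2, t3):
    """exp(t1 X1) exp(t2 X2) exp(t3 X3), with X1, X2, X3 as in the text."""
    g12 = matmul(exp_heis(t1, 0, 0), exp_heis(0, 0, t2))
    return matmul(g12, exp_heis(0, t3, 0))

t = (Fr(1, 2), Fr(-3, 4), Fr(5, 7))
print(psi(from_malcev(*t)) == t)
```

The check `psi(from_malcev(*t)) == t` is the statement that $\psi_{\mathcal{X}}$ inverts the product of exponentials; the first two coordinates are exactly what the projection $\pi$ retains.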



We shall be working through the special case of Theorem 2.9 in which $g : \mathbb{Z} \to G$
is a linear sequence. To simplify the exposition very slightly we will assume
that this sequence has no constant term, thus $g(n) = a^n$ for some $a \in G$. Note that
$g \in \mathrm{poly}(\mathbb{Z}, G_\bullet)$, where $G_\bullet$ is the lower central series filtration. Thus the sequence $g$ has
degree 2.

Proposition 5.1 (Main theorem, Heisenberg case). Let $G/\Gamma$ be the 2-step Heisenberg
nilmanifold with the Mal'cev basis $\mathcal{X}$ described above, and let $g : \mathbb{Z} \to G$ be a linear
sequence of the form $g(n) = a^n$. Let $\delta > 0$ be a parameter and let $N \geq 1$ be an integer.
Then either $(g(n)\Gamma)_{n \in [N]}$ is $\delta$-equidistributed, or else there is a horizontal character $\eta$
with $0 < |\eta| \ll \delta^{-O(1)}$ such that $\|\eta(a)\|_{\mathbb{R}/\mathbb{Z}} \ll \delta^{-O(1)}/N$.

Remark. Note that, since $g(n)$ is linear, the last condition here is equivalent to the
statement that $\|\eta \circ g\|_{C^\infty[N]} \ll \delta^{-O(1)}$.



Proof. By Lemma 3.7 we may assume that there is a function $F : G/\Gamma \to \mathbb{C}$ with a
vertical oscillation $\xi$ with $|\xi| \ll \delta^{-O(1)}$, and $\|F\|_{\mathrm{Lip}} = 1$, such that

$$\Big|\mathbb{E}_{n \in [N]} F(a^n\Gamma) - \int_{G/\Gamma} F\Big| \gg \delta^{O(1)}. \tag{5.1}$$

We split into two cases: $\xi = 0$ and $\xi \neq 0$.

If $\xi = 0$, then $F$ is $G_2$-invariant, which means we may factor through $\pi$ to get a
function $\tilde F : \mathbb{R}^2/\mathbb{Z}^2 \to \mathbb{C}$ defined by

$$F(x\Gamma) = \tilde F(\pi(x)).$$

It is clear that $\|\tilde F\|_{\mathrm{Lip}} \leq 1$. Equation (5.1) implies that

$$\Big|\mathbb{E}_{n \in [N]} \tilde F(n\pi(a)) - \int_{\mathbb{R}^2/\mathbb{Z}^2} \tilde F\Big| \gg \delta^{O(1)}\|\tilde F\|_{\mathrm{Lip}}.$$

Proposition 5.1 in this case now follows immediately from Proposition 3.1. Note how
the $G_2$-invariance allowed us to reduce a 2-step problem into a 1-step one.
Suppose then that $\xi \neq 0$. The integral of $F$ over every translate of $G_2/(\Gamma \cap G_2)$ is
then zero, and hence $\int_{G/\Gamma} F = 0$. Thus (5.1) becomes

$$|\mathbb{E}_{n \in [N]} F(a^n\Gamma)| \gg \delta^{O(1)}.$$

We now come to one of the key ideas of the proof, which is to apply the van der Corput
lemma, Corollary 4.2. This tells us that there are $\gg \delta^{O(1)}N$ values of $h \in [N]$ such that

$$|\mathbb{E}_{n \in [N]} F(a^{n+h}\Gamma)\overline{F(a^n\Gamma)}| \gg \delta^{O(1)}. \tag{5.2}$$

It is very natural to try and interpret this in terms of a nilsequence on the product
nilmanifold $G^2/\Gamma^2$. To do this we first observe by direct computation that any $x \in G$
may be factored uniquely as $x = \{x\}[x]$, where $\psi(\{x\}) \in [0,1)^3$ and $[x] \in \Gamma$.

Let us, then, factor $a^h = \{a^h\}[a^h]$. The inequality (5.2) implies that

$$|\mathbb{E}_{n \in [N]} F(a^n\{a^h\}\Gamma)\overline{F(a^n\Gamma)}| \gg \delta^{O(1)}$$

for $\gg \delta^{O(1)}N$ values of $h$. This can be rewritten as

$$|\mathbb{E}_{n \in [N]} \tilde F_h(a_h^n\Gamma^2)| \gg \delta^{O(1)} \tag{5.3}$$

for $\gg \delta^{O(1)}N$ values of $h$, where $\tilde F_h : G^2/\Gamma^2 \to \mathbb{C}$ is given by

$$\tilde F_h(x, y) := F(\{a^h\}x)\overline{F(y)}$$

and the element $a_h$ is given by

$$a_h := (\{a^h\}^{-1}a\{a^h\}, a).$$

At first sight, the estimates (5.3) do not appear much better than our original estimate
(5.1); indeed, it seems "worse" since we are now working on a 6-dimensional 2-step
nilmanifold rather than a 3-dimensional 2-step one.

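The crucial observation, however, is that all the elements $a_h$ in fact lie not just in $G^2$,
but in the smaller group

$$G^\square := G \times_{G_2} G := \{(g, g') : g^{-1}g' \in G_2\}.$$

This is also a 2-step nilpotent, connected, simply connected Lie group (of dimension 4).
It is not hard to check that $[G^\square, G^\square]$ is the diagonal group $G_2^\Delta := \{(g_2, g_2) : g_2 \in G_2\}$, and
that one can take for a Mal'cev basis of $G^\square/\Gamma^\square$ the collection $\mathcal{X}^\square = \{X_1^\square, X_2^\square, X_3^\square, X_4^\square\}$
given by

$$X_1^\square = \begin{pmatrix} 0 & 1 & \{0,0\} \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \quad X_2^\square = \begin{pmatrix} 0 & 0 & \{0,0\} \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{pmatrix}, \quad X_3^\square = \begin{pmatrix} 0 & 0 & \{1,0\} \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix} \quad \text{and} \quad X_4^\square = \begin{pmatrix} 0 & 0 & \{1,1\} \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix},$$

where we have written

$$\begin{pmatrix} 0 & x & \{y, y'\} \\ 0 & 0 & z \\ 0 & 0 & 0 \end{pmatrix} := \left( \begin{pmatrix} 0 & x & y \\ 0 & 0 & z \\ 0 & 0 & 0 \end{pmatrix}, \begin{pmatrix} 0 & x & y' \\ 0 & 0 & z \\ 0 & 0 & 0 \end{pmatrix} \right).$$

This allows us to identify the horizontal torus of $G^\square/\Gamma^\square$ with $\mathbb{R}^3/\mathbb{Z}^3$ by projecting onto
the first three coordinates.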

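Now (5.3) implies that for $\gg \delta^{O(1)}N$ values of $h$ we have

$$|\mathbb{E}_{n \in [N]} F_h^\square((a_h^\square)^n\Gamma^\square)| \gg \delta^{O(1)}, \tag{5.4}$$

where $F_h^\square$ and $a_h^\square$ are the restrictions of $\tilde F_h$ and $a_h$ to $G^\square$, and $\Gamma^\square := \Gamma \times_{\Gamma \cap G_2} \Gamma$. By
inspecting the action of $G_2^2$ on $F_h^\square$ (and the hypothesis $\xi \neq 0$) we also conclude that
$\int_{G^\square/\Gamma^\square} F_h^\square = 0$.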
Now, the group $G^\square$ is still 2-step nilpotent, so we do not appear to have reduced to
a 1-step situation yet. However, recall that $F$ has vertical oscillation $\xi$. Using this and
the fact that $g_2$ is central in $G$, we obtain

$$F_h^\square((g_2, g_2) \cdot (g, g')) = F(\{a^h\}g_2g)\overline{F(g_2g')} = e(\xi(g_2))\overline{e(\xi(g_2))}F(\{a^h\}g)\overline{F(g')} = F_h^\square((g, g')).$$

Thus $F_h^\square$ is $[G^\square, G^\square]$-invariant. In (5.4) we may therefore factor through the projection
$\pi^\square$ to obtain

$$|\mathbb{E}_{n \in [N]} F'_h(n\pi^\square(a_h))| \gg \delta^{O(1)}$$

for $\gg \delta^{O(1)}N$ values of $h$, where the function $F'_h : \mathbb{R}^3/\mathbb{Z}^3 \to \mathbb{C}$ is defined by

$$F'_h(\pi^\square(x)) = F_h^\square(x\Gamma^\square).$$

We leave it to the reader to check that $\|F'_h\|_{\mathrm{Lip}} = O(1)$ (in the general case to follow
this computation is given in more detail). Since $F_h^\square$ has mean zero, we see that $F'_h$ has
mean zero also.

We are now finally in a situation in which we may apply "1-step" tools. Indeed, from
Proposition 3.1 we see that for each such $h$ there is some $k_h^\square \in \mathbb{Z}^3$, $0 < |k_h^\square| \ll \delta^{-O(1)}$, such that

$$\|k_h^\square \cdot \pi^\square(a_h)\|_{\mathbb{R}/\mathbb{Z}} \ll \delta^{-O(1)}/N.$$

Pigeonholing in $h$, we may assume that $k_h^\square = k^\square$ is independent of $h$. Define $\eta : G^\square \to \mathbb{R}/\mathbb{Z}$ by

$$\eta(x) := k^\square \cdot \pi^\square(x).$$

Then $\eta$ is an additive homomorphism which annihilates $[G^\square, G^\square]$ and $\Gamma^\square$, and we have

$$\|\eta(a_h)\|_{\mathbb{R}/\mathbb{Z}} \ll \delta^{-O(1)}/N \tag{5.5}$$

for $\gg \delta^{O(1)}N$ values of $h \in [N]$.

Our task now is to "piece together" these pieces of information for many different
$h$ to deduce Proposition 5.1. We begin by factoring the character $\eta$ on $G^\square$ into two
simpler components, which originate from $G$ (or $G_2$) rather than $G^\square$.

Lemma 5.2 (Decomposition of $\eta$). There exist horizontal characters $\eta_1 : G \to \mathbb{R}/\mathbb{Z}$ and
$\eta_2 : G_2 \to \mathbb{R}/\mathbb{Z}$ on $G$ and $G_2$ respectively (thus $\eta_1$ annihilates $\Gamma$ and $\eta_2$ annihilates
$\Gamma \cap G_2$) such that

$$\eta(g', g) = \eta_1(g) + \eta_2(g'g^{-1}) \tag{5.6}$$

for all $(g', g) \in G^\square$. Furthermore we have $|\eta_1|, |\eta_2| \ll \delta^{-O(1)}$.

Proof. Since $\eta$ is an additive homomorphism we have $\eta(g', g) = \eta((g'g^{-1}, \mathrm{id}_G) \cdot (g, g)) =
\eta(g, g) + \eta(g'g^{-1}, \mathrm{id}_G)$. Thus if we define $\eta_1(g) := \eta(g, g)$ and $\eta_2(g_2) := \eta(g_2, \mathrm{id}_G)$ then (5.6)
is immediately seen to hold. Now $\eta_1$ is a horizontal character because $\eta$ annihilates $\Gamma^\square$,
which contains $\Gamma^\Delta$. Furthermore $\Gamma^\square$ also contains $(\Gamma \cap G_2) \times \mathrm{id}_G$ and hence $\eta_2$ annihilates
$\Gamma \cap G_2$ as claimed. The bounds on $|\eta_1|$ and $|\eta_2|$ are left as an exercise to the reader;
one may compute explicitly with the Mal'cev bases $\mathcal{X}^\square$ and $\mathcal{X}$ on $G^\square/\Gamma^\square$ and $G/\Gamma$
respectively. □

Using this decomposition and the fact that, in the Heisenberg group, we have the
identity $x^{-1}yxy^{-1} = [y, x]$ since commutators are central, we see that

$$\eta(a_h) = \eta_1(a) + \eta_2([a, \{a^h\}]).$$
Now a straightforward computation with matrices confirms that if $\psi(x) = (t_1, t_2, t_3)$
and $\psi(y) = (u_1, u_2, u_3)$ then $\psi([x, y]) = (0, 0, t_1u_2 - t_2u_1)$, and also that if $\psi(a) =
(\gamma_1, \gamma_2, *)$ then $\psi(\{a^h\}) = (\{\gamma_1h\}, \{\gamma_2h\}, *)$, where we do not care about the values of
the coordinates marked with an asterisk $*$. Thus if we write $\gamma := (\gamma_1, \gamma_2) = \pi(a)$ and
$\zeta := (-\gamma_2, \gamma_1)$ then

$$\eta(a_h) = k_1 \cdot \gamma + k_2\zeta \cdot \{\gamma h\},$$

where $k_1, k_2 = O(\delta^{-O(1)})$ are the frequencies of $\eta_1, \eta_2$ respectively. Thus if (5.5) holds
then

$$\|k_1 \cdot \gamma + k_2\zeta \cdot \{\gamma h\}\|_{\mathbb{R}/\mathbb{Z}} \ll \delta^{-O(1)}/N \tag{5.7}$$

for $\gg \delta^{O(1)}N$ values of $h$. The next proposition derives diophantine information
concerning $\gamma$ and $\zeta$ from a hypothesis such as this. In fact we handle a slightly more general
situation, since this will be useful when we come to handle the general case of Theorem
2.9. In the following proposition we shall take $\alpha = 0$ and $m = 2$; the proof when $\alpha = 0$
is actually considerably shorter and the reader may care to work through that case to
better understand the argument.
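The two matrix computations quoted before (5.7) can be checked in exact arithmetic. The Python sketch below is only an illustration under stated assumptions: Heisenberg elements are stored as triples of matrix entries, the commutator convention is taken to be $[g, h] = g^{-1}h^{-1}gh$, and the helper names (`mul`, `inv`, `factor`, and so on) are hypothetical.

```python
from fractions import Fraction as Fr
from math import floor

# A Heisenberg element is a triple (x, y, z): the (1,2), (1,3), (2,3) matrix entries.
def mul(g, h):
    return (g[0] + h[0], g[1] + h[1] + g[0] * h[2], g[2] + h[2])

def inv(g):
    x, y, z = g
    return (-x, -y + x * z, -z)

def power(g, n):
    """g^n for an integer n >= 0."""
    out = (0, 0, 0)
    for _ in range(n):
        out = mul(out, g)
    return out

def psi(g):
    """Mal'cev coordinates (x, z, y - xz)."""
    x, y, z = g
    return (x, z, y - x * z)

def commutator(g, h):
    """[g, h] := g^{-1} h^{-1} g h (assumed convention)."""
    return mul(mul(inv(g), inv(h)), mul(g, h))

def frac(t):
    return t - floor(t)

def factor(g):
    """Return ({g}, [g]) with g = {g}[g], psi({g}) in [0,1)^3 and [g] an integer triple."""
    x, y, z = g
    m = floor(x)
    k = floor(y - x * z + m * z)   # makes the third Mal'cev coordinate of {g} lie in [0,1)
    gamma = (m, k, floor(z))
    return mul(g, inv(gamma)), gamma

a = (Fr(7, 5), Fr(1, 3), Fr(-9, 4))
g1, g2 = psi(a)[0], psi(a)[1]
for h in range(1, 6):
    frac_part, _ = factor(power(a, h))
    # third coordinate of psi([a, {a^h}]) versus zeta . {gamma h} with zeta = (-g2, g1)
    print(psi(commutator(a, frac_part))[2] == -g2 * frac(g1 * h) + g1 * frac(g2 * h))
```

Each printed comparison is the identity $\psi([a, \{a^h\}]) = (0, 0, \zeta \cdot \{\gamma h\})$ with $\zeta = (-\gamma_2, \gamma_1)$, which is the source of the bracket expression in (5.7).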

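Proposition 5.3 (Bracket polynomial lemma). Let $\delta \in (0, 1)$ and let $N \geq 1$ be an
integer. Suppose that $\alpha, \beta \in \mathbb{R}$ and that $|\alpha| \leq 1/\delta N$. Suppose that $\gamma \in \mathbb{R}^m/\mathbb{Z}^m$ and that
$\zeta \in \mathbb{R}^m$ satisfies $|\zeta| \leq 1/\delta$. Suppose that for at least $\delta N$ values of $h \in [N]$ we have

$$\|\beta + \alpha h + \zeta \cdot \{\gamma h\}\|_{\mathbb{R}/\mathbb{Z}} \leq 1/\delta N. \tag{5.8}$$

Then either $|\zeta_i| \ll_m \delta^{-O_m(1)}/N$ for all $1 \leq i \leq m$, or else there is some $k \in \mathbb{Z}^m$,
$0 < |k| \ll_m \delta^{-O_m(1)}$, such that $\|k \cdot \gamma\|_{\mathbb{R}/\mathbb{Z}} \ll_m \delta^{-O_m(1)}/N$.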

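Proof. If $\sup_i |\zeta_i| \leq 1/\delta N$ then we are done, so assume this is not the case. Then the
assumption implies that $\|\beta + \alpha h\|_{\mathbb{R}/\mathbb{Z}} \leq (1 + m)\sup_i |\zeta_i|$ for $\geq \delta N$ values of $h \in [N]$.
Then Lemma 3.2 implies that there is $q \ll \delta^{-C}$ such that $\|q\alpha\|_{\mathbb{R}/\mathbb{Z}} \ll_m \sup_i |\zeta_i|\delta^{-C}/N$
for some absolute constant $C > 0$. Since we are assuming that $|\alpha| \leq 1/\delta N$ this forces
us to conclude that in fact $|\alpha| \ll_m \sup_i |\zeta_i|\delta^{-C}/N$ unless $N \ll_m \delta^{-O(1)}$, in which case
the result is trivial in any case.

Split $[N]$ into intervals of length between $N'$ and $2N'$, where $N' := c_m\delta^{C+1}N$ and
$c_m > 0$ is a small number to be chosen later. By the pigeonhole principle, we can find
one of these intervals $I$ in which there are $\geq \delta|I|$ values of $h$ such that (5.8) holds. If
$c_m$ is chosen sufficiently small then $\alpha h$ does not vary by more than $\frac{\delta}{20}\sup_i |\zeta_i|$ on such
an interval, and we conclude that there is $\theta$ such that

$$\|\theta + \zeta \cdot \{\gamma h\}\|_{\mathbb{R}/\mathbb{Z}} \leq \frac{\delta}{20}\sup_i |\zeta_i| + \frac{1}{\delta N}$$

for at least $\delta|I|$ values of $h \in I$. Now if $\sup_i |\zeta_i| \leq 20/\delta^2N$ then the proposition holds, so
we may assume that this is not the case, in which eventuality we have

$$\|\theta + \zeta \cdot \{\gamma h\}\|_{\mathbb{R}/\mathbb{Z}} \leq \frac{\delta}{10}|\zeta_i| \tag{5.9}$$

for some $i \in [m]$ and for at least $\delta|I|$ values of $h \in I$. We then set

$$\Omega := \{t \in \mathbb{R}^m/\mathbb{Z}^m : \|\theta + \zeta \cdot \{t\}\|_{\mathbb{R}/\mathbb{Z}} \leq \delta|\zeta_i|/10\}$$

and

$$\tilde\Omega := \{x \in \mathbb{R}^m/\mathbb{Z}^m : \mathrm{dist}(x, \Omega) \leq \delta/10\}.$$

For fixed $u \in \mathbb{R}^m/\mathbb{Z}^m$ the slice

$$\{t \in \tilde\Omega : t_j = u_j \text{ for } j \neq i\}$$

is a union of intervals of length less than $\delta/2$, and so $\mathrm{vol}(\tilde\Omega) \leq \delta/2$. Let $F : \mathbb{R}^m/\mathbb{Z}^m \to
\mathbb{R}^+$ be the function

$$F(x) := \max\Big(1 - \frac{10\,\mathrm{dist}(x, \Omega)}{\delta}, 0\Big).$$

Then $F = 1$ on $\Omega$ and so our assumption implies that

$$\mathbb{E}_{n \in I} F(\gamma n) \geq \delta. \tag{5.10}$$

On the other hand $F$ is supported on $\tilde\Omega$ and so

$$\int_{\mathbb{R}^m/\mathbb{Z}^m} F(x)\,dx \leq \mathrm{vol}(\tilde\Omega) \leq \frac{\delta}{2}. \tag{5.11}$$

Thus of course

$$\Big|\mathbb{E}_{n \in I} F(\gamma n) - \int_{\mathbb{R}^m/\mathbb{Z}^m} F(x)\,dx\Big| \geq \frac{\delta}{2}.$$

However $F$ has been constructed so that $\|F\|_{\mathrm{Lip}} \ll 1/\delta$ (we leave this as an exercise)
and so we conclude that $(\gamma n)_{n \in I}$ is not $c\delta^2$-equidistributed. Applying Proposition 3.1 we
conclude that there is $k \in \mathbb{Z}^m$ with $0 < |k| \ll \delta^{-O_m(1)}$ such that $\|k \cdot \gamma\|_{\mathbb{R}/\mathbb{Z}} \ll \delta^{-O_m(1)}/N' \ll \delta^{-O_m(1)}/N$,
and the claim follows. □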



Recall that in our efforts to prove Proposition 5.1 we had established the condition
(5.7). Applying Proposition 5.3 and recalling that $\gamma = (\gamma_1, \gamma_2)$ and $\zeta = (-\gamma_2, \gamma_1)$
we see that in all cases there is some nonzero $k' \in \mathbb{Z}^2$ with $|k'| \ll \delta^{-O(1)}$ such that
$\|k' \cdot \gamma\|_{\mathbb{R}/\mathbb{Z}} \ll \delta^{-O(1)}/N$, that is to say $\|k' \cdot \pi(a)\|_{\mathbb{R}/\mathbb{Z}} \ll \delta^{-O(1)}/N$. This concludes the
proof of Proposition 5.1. □

Let us pause for a moment to consider the form of the argument just presented. There
were two places where we reduced matters to a simpler situation. First of all in the case
$\xi = 0$ we were able to consider $F$ as a function on a 1-step nilmanifold. Secondly when
we applied the van der Corput trick we found ourselves with a function $F_h^\square$ which had
0 as a vertical frequency, and so we were again able to reduce to the 1-step case, although
we had to restrict the ambient nilmanifold (from $G^2/\Gamma^2$ to $G^\square/\Gamma^\square$) and also quotient
out by a commutator group $[G^\square, G^\square]$ before the 1-step structure became manifest. This
already makes it clear that some kind of induction is going on, and in the general case
we will see this quite clearly.



6. Polynomial sequences in nilpotent groups 



Our analysis of linear sequences on the Heisenberg example captured much of the
essence of the proof of Theorem 2.9 in general. What it did not reveal, however, was
the rather subtle structure of the space of polynomial sequences $g : \mathbb{Z} \to G$. In this
section we begin by establishing a remarkable result of Lazard [12], which asserts that
$\mathrm{poly}(\mathbb{Z}, G_\bullet)$ is a group for any filtration $G_\bullet$. Lazard's proof uses the Lie algebra $\mathfrak{g}$ and
it works if $G$ is a connected and simply-connected Lie group (as in the present paper).
However it turns out that the result is true with no topological assumptions on $G$, and
indeed in the greater generality of so-called polynomial mappings from $H$ to $G$, where
$H$ is an arbitrary group. This result is due to Leibman [21] (see also [2U] for a proof of
the special case $H = \mathbb{Z}$).

We will then use the Lazard-Leibman results to derive sundry further results
concerning the representation of elements of $\mathrm{poly}(\mathbb{Z}, G_\bullet)$ in coordinates. In fact, keeping
in mind our intention to prove multiparameter results later in the paper, we develop the theory of
polynomial maps $\mathrm{poly}(\mathbb{Z}^t, G_\bullet)$.

Definition 6.1 (Polynomial maps). Let $H$ be a group and let $G$ be a nilpotent group
with a filtration $G_\bullet$. If $g : H \to G$ is a map and if $h \in H$ we write $\partial_h g$ for the map
defined by $\partial_h g(x) = g(xh)g(x)^{-1}$. We say that $g$ is a polynomial map with coefficients
in $G_\bullet$ if we have $\partial_{h_i} \cdots \partial_{h_1} g(x) \in G_i$ for all choices of $i$ and for all $h_1, \dots, h_i \in H$ and
$x \in H$. We write $\mathrm{poly}(H, G_\bullet)$ for the collection of all such mappings. If $g : H \to G$
is a map we say that $g$ is a polynomial sequence of degree at most $d$ if there exists a
filtration $G_\bullet$ of degree at most $d$ such that $g$ has coefficients in $G_\bullet$.
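To make the derivative condition concrete, here is a small Python sketch for $H = \mathbb{Z}$ (written additively, so $xh$ becomes $n + h$) and $G$ the integer Heisenberg group with its lower central series filtration. It checks on sample values that the sequence $n \mapsto a^nb^n$ has second derivatives in $G_2$ and trivial third derivatives. The triple representation, the helper names and the particular $a, b$ are assumptions made for illustration only.

```python
# A Heisenberg element is a triple (x, y, z): the (1,2), (1,3), (2,3) matrix entries.
IDENTITY = (0, 0, 0)

def mul(g, h):
    return (g[0] + h[0], g[1] + h[1] + g[0] * h[2], g[2] + h[2])

def inv(g):
    x, y, z = g
    return (-x, -y + x * z, -z)

def power(g, n):
    """g^n for an integer n >= 0."""
    out = IDENTITY
    for _ in range(n):
        out = mul(out, g)
    return out

def derivative(g, h):
    """Discrete derivative: (d_h g)(n) = g(n + h) g(n)^{-1}."""
    return lambda n: mul(g(n + h), inv(g(n)))

def in_G2(x):
    """G_2 = [G, G] is the centre: only the (1,3) entry may be nonzero."""
    return x[0] == 0 and x[2] == 0

a, b = (1, 0, 2), (3, 1, -1)          # sample elements, chosen arbitrarily
g = lambda n: mul(power(a, n), power(b, n))

second = derivative(derivative(g, 1), 2)
third = derivative(second, 3)
print(in_G2(second(0)), third(0) == IDENTITY)
```

Since $G_3$ is trivial for the Heisenberg group, the vanishing third derivative is exactly the condition $\partial_{h_3}\partial_{h_2}\partial_{h_1} g \in G_3$ of the definition.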

Proposition 6.2 (Lazard-Leibman theorem [21]). Let $H$ be a group, let $G$ be a nilpotent
group, and let $G_\bullet$ be a filtration. Then $\mathrm{poly}(H, G_\bullet)$, the space of polynomial maps
$g : H \to G$ having coefficients in $G_\bullet$, is a group.

Remarks. This result is contained in [21] (although the result is only stated in the case
that $G_\bullet$ is the lower central series filtration, the proof does not use this fact). Our proof
is a little different, relying on the machinery of Host-Kra cube groups. These featured 
for the first time in [T5J §5, §11] and were discussed subsequently in [TSJ Appendix E]. 
See also the recent preprint [T7]. We thank Sasha Leibman for helpful conversations 
concerning these methods. 

One should mention at this point the Hall-Petresco theorem [El [27], which established
a special case of the Lazard-Leibman theorem. This theorem states that if $G_\bullet$ is the
lower central series filtration then the sequence $n \mapsto a^nb^n$ lies in $\mathrm{poly}(\mathbb{Z}, G_\bullet)$ for any
$a, b \in G$.

In this section it is convenient to generalise the notion of a filtration somewhat. By
a prefiltration $G_\bullet$ on a nilpotent group $G$ we mean a sequence

$$G \supseteq G_0 \supseteq G_1 \supseteq \dots \supseteq G_d \supseteq G_{d+1} = \{\mathrm{id}_G\}$$

of subgroups with the property that $[G_i, G_j] \subseteq G_{i+j}$ for all $i, j \geq 0$. The only difference
between a prefiltration and a filtration (cf. Definition 1.1) is that we no longer require
that $G = G_0 = G_1$. The definition of $\mathrm{poly}(H, G_\bullet)$ extends in a completely obvious way
to prefiltrations.

For each integer $k \geq 0$ we are going to define the Host-Kra cube group $\mathrm{HK}^k(G_\bullet)$
associated to the prefiltration $G_\bullet$. This will be a subgroup of $G^{\{0,1\}^k}$, the product of $2^k$
copies of $G$ indexed by the cube $\{0,1\}^k$. Before giving the definition, we need to set up
some nomenclature concerning these cubes.

Each element $\omega \in \{0,1\}^k$ corresponds in an obvious way to a subset of $[k]$, and we
write $\omega \subseteq \omega'$ when the corresponding sets are nested. An upper face $F$ is a subset of
$\{0,1\}^k$ of the form $F(\omega_0) := \{\omega \in \{0,1\}^k : \omega \supseteq \omega_0\}$. There are, of course, $2^k$ upper
faces, one for each $\omega_0 \in \{0,1\}^k$. The codimension $\mathrm{codim}(F)$ of $F$ is simply the number
of ones in $\omega_0$. Note that if $F, F'$ are two upper faces then $F \cap F'$ is also an upper face,
and $\mathrm{codim}(F \cap F') \leq \mathrm{codim}(F) + \mathrm{codim}(F')$.
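The combinatorics of upper faces is small enough to enumerate. The following Python sketch lists the upper faces of $\{0,1\}^3$ and confirms that intersections of upper faces are upper faces with subadditive codimension. The function names `upper_face` and `codim` are illustrative, and elements of the cube are represented as 0/1 tuples.

```python
from itertools import product

def upper_face(omega0):
    """F(omega0): all omega in {0,1}^k containing omega0 (as subsets of [k])."""
    k = len(omega0)
    return frozenset(w for w in product((0, 1), repeat=k)
                     if all(wi >= oi for wi, oi in zip(w, omega0)))

def codim(face):
    """Number of ones in the minimal element omega0 of the face."""
    return min(sum(w) for w in face)

k = 3
cube = list(product((0, 1), repeat=k))
faces = {upper_face(w) for w in cube}
print(len(faces))                     # one upper face per omega0

for w, w2 in product(cube, repeat=2):
    join = tuple(max(x, y) for x, y in zip(w, w2))   # union of the two index sets
    F, F2 = upper_face(w), upper_face(w2)
    assert F & F2 == upper_face(join)
    assert codim(F & F2) <= codim(F) + codim(F2)
```

The intersection $F(\omega_0) \cap F(\omega_0')$ is $F(\omega_0 \cup \omega_0')$, which is why the codimension is at most the sum.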

Given an upper face $F$ and an element $x \in G$ we write $x^F$ for the element of $G^{\{0,1\}^k}$
defined by

$$(x^F)_\omega = \begin{cases} x & \text{if } \omega \in F \\ \mathrm{id}_G & \text{otherwise.} \end{cases}$$

Write $G_{(F)}$ for the subgroup of $G^{\{0,1\}^k}$ consisting of all elements $x^F$ with $x \in G_{\mathrm{codim}(F)}$,
where $G_i$ is the $i$th group in the prefiltration $G_\bullet$; we call such a group an upper face
group.

Definition 6.3 (Host-Kra cube group). Let $G_\bullet$ be a prefiltration on a nilpotent group
$G$, and let $k \geq 0$ be an integer. Then the Host-Kra cube group $\mathrm{HK}^k(G_\bullet)$ is the subgroup
of $G^{\{0,1\}^k}$ generated by the upper face groups $G_{(F)}$.

The Host-Kra cube group can, it turns out, be described in a rather explicit way.
Write $\prec$ for the reverse lexicographic ordering on $\{0,1\}^k$, thus $\omega \prec \omega'$ if and only if
there is some $j$ such that $\omega_j < \omega'_j$ and $\omega_i = \omega'_i$ for $i = j+1, \dots, k$. This induces an
ordering on the upper faces $F$. We write $F(\omega) \succ F(\omega')$ if and only if $\omega \prec \omega'$. Let
$F_0 \prec F_1 \prec \dots \prec F_{2^k-1}$ be the complete list of upper faces in this order; thus $F_0 = \{1^k\}$
and $F_{2^k-1} = \{0,1\}^k$.

Lemma 6.4 (Description of Host-Kra cube group). We have

$$\mathrm{HK}^k(G_\bullet) = G_{(F_0)} \cdot G_{(F_1)} \cdot \ldots \cdot G_{(F_{2^k-1})}.$$

That is, every element of $\mathrm{HK}^k(G_\bullet)$ may be written as $\gamma_0^{F_0} \cdots \gamma_{2^k-1}^{F_{2^k-1}}$, where $\gamma_i \in G_{\mathrm{codim}(F_i)}$.
The representation is in fact unique.

Proof. The key point here is the inclusion

$$[G_{(F)}, G_{(F')}] \subseteq G_{(F \cap F')}. \tag{6.1}$$

This follows immediately from the fact that

$$[G_{\mathrm{codim}(F)}, G_{\mathrm{codim}(F')}] \subseteq G_{\mathrm{codim}(F)+\mathrm{codim}(F')} \subseteq G_{\mathrm{codim}(F \cap F')}.$$

Using this fact repeatedly, we shift all elements coming from $G_{(F_0)}$ to the left. We then
shift all elements coming from $G_{(F_1)}$ to the left, and so on. We leave the routine details
and the proof that the representation is unique (which we do not actually need) to the
reader. □

Host-Kra cube groups and polynomial maps. It is now time to develop the
link between Host-Kra cube groups $\mathrm{HK}^k(G_\bullet)$ and polynomial maps $g \in \mathrm{poly}(H, G_\bullet)$.
To do this we introduce the notion of a parallelepiped on $H$. This is an element in
$H^{\{0,1\}^k}$ of the form $(xh^\omega)_{\omega \in \{0,1\}^k}$, where $x \in H$, $h = (h_1, \dots, h_k)$ is a $k$-tuple of
elements of $H$, and $h^\omega := h_1^{\omega_1} \cdots h_k^{\omega_k}$. For example the tuple $(x, xh_1, xh_2, xh_1h_2)$ is a
parallelepiped in $H^{\{0,1\}^2}$, and $(x, xh_1, xh_2, xh_1h_2, xh_3, xh_1h_3, xh_2h_3, xh_1h_2h_3)$ is a
parallelepiped in $H^{\{0,1\}^3}$. Write $H^{[k]}$ for the set of parallelepipeds in $H^{\{0,1\}^k}$ (if $H$ is abelian
$H^{[k]}$ is actually a group, but this need not be the case in general and in any case is not
important here).
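The shape of a parallelepiped is easiest to see symbolically. The sketch below builds $(xh^\omega)_{\omega \in \{0,1\}^k}$ in the free monoid on the symbols involved, using Python strings with concatenation as the group law. The function name and the enumeration order (with $\omega_1$ varying fastest) are illustrative choices.

```python
def parallelepiped(x, hs):
    """List the words x h^omega for omega in {0,1}^k, with omega_1 varying fastest.

    Elements are strings and the product is concatenation, so the output shows
    the formal words x h_1^{omega_1} ... h_k^{omega_k}.
    """
    k = len(hs)
    out = []
    for code in range(2 ** k):
        omega = [(code >> i) & 1 for i in range(k)]   # omega_1 is the lowest bit
        word = x
        for h, bit in zip(hs, omega):
            if bit:
                word += h
        out.append(word)
    return out

print(parallelepiped("x", ["h1", "h2"]))
print(parallelepiped("x", ["h1", "h2", "h3"]))
```

For $k = 3$ the eight words come out as $x, xh_1, xh_2, xh_1h_2, xh_3, xh_1h_3, xh_2h_3, xh_1h_2h_3$, and the last four are the first four multiplied on the right by $h_3$. This is the splitting of $\{0,1\}^{k+1}$ into two copies of $\{0,1\}^k$ used in inductive arguments on $k$.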
Suppose that $g : H \to G$ is a map. Then for any $k \geq 0$ there is an obvious induced
map $g^{\{0,1\}^k} : H^{\{0,1\}^k} \to G^{\{0,1\}^k}$.

Proposition 6.5 (Characterization of polynomial maps). Suppose that $H$ is a group,
that $G$ is a nilpotent group together with a prefiltration $G_\bullet$, and that $g : H \to G$. Then
$g$ lies in $\mathrm{poly}(H, G_\bullet)$ if and only if $g^{\{0,1\}^k}$ maps $H^{[k]}$ to $\mathrm{HK}^k(G_\bullet)$ for all $k \geq 0$.

Remark. The reader might find it useful, as an exercise to get to grips with the
notation, to verify this in the case $H = G$ and $g$ being the identity mapping.

We note that Proposition 6.2 is an immediate consequence of Proposition 6.5. Indeed
if $g^{\{0,1\}^k}$ and $\tilde g^{\{0,1\}^k}$ both map $H^{[k]}$ to $\mathrm{HK}^k(G_\bullet)$ then so does $(g\tilde g)^{\{0,1\}^k}$, since $\mathrm{HK}^k(G_\bullet)$
is a group; the same reasoning applies to $(g^{-1})^{\{0,1\}^k}$.

Proof of Proposition 6.5. We start by establishing the only if direction of the proposition,
proving by induction on $k$ that $g^{\{0,1\}^k}$ does indeed map $H^{[k]}$ to $\mathrm{HK}^k(G_\bullet)$ when
$g \in \mathrm{poly}(H, G_\bullet)$. This is clear when $k = 0$. Suppose it is known for a given value of
$k \geq 0$. If $X$ is a set, we may regard $X^{\{0,1\}^{k+1}}$ as a product of two copies of $X^{\{0,1\}^k}$,
the first factor corresponding to those $\omega$ with $\omega_{k+1} = 0$ and the second to those $\omega$ with
$\omega_{k+1} = 1$. With this notation, every $z \in H^{[k+1]}$ may be written $z = (\tilde z, \tilde zh_{k+1})$, where
$\tilde z := (xh^\omega)_{\omega \in \{0,1\}^k}$. We may factor $g^{\{0,1\}^{k+1}}(z)$ as a product of two elements, namely

$$g^{\{0,1\}^{k+1}}(z) = \big(\mathrm{id}^{\{0,1\}^k}, (\partial_{h_{k+1}}g)^{\{0,1\}^k}(\tilde z)\big) \cdot \big(g^{\{0,1\}^k}(\tilde z), g^{\{0,1\}^k}(\tilde z)\big). \tag{6.2}$$

By the inductive hypothesis we have $g^{\{0,1\}^k}(\tilde z) \in \mathrm{HK}^k(G_\bullet)$. The derivative $\partial_{h_{k+1}}g : H \to G$
is a polynomial map with coefficients in the prefiltration $G_\bullet^{+1}$ defined by $G_i^{+1} := G_{i+1}$
(note that this is a prefiltration, since

$$[G_i^{+1}, G_j^{+1}] = [G_{i+1}, G_{j+1}] \subseteq G_{i+j+2} \subseteq G_{i+j+1} = G_{i+j}^{+1}).$$

By a second application of the inductive hypothesis we therefore have $(\partial_{h_{k+1}}g)^{\{0,1\}^k}(\tilde z) \in
\mathrm{HK}^k(G_\bullet^{+1})$. In view of (6.2) it therefore suffices to show the inclusions

$$\mathrm{HK}^k(G_\bullet)^\Delta \subseteq \mathrm{HK}^{k+1}(G_\bullet)$$

(where $\mathrm{HK}^k(G_\bullet)^\Delta$ is the diagonal subgroup $\{(t, t) : t \in \mathrm{HK}^k(G_\bullet)\}$) and

$$\mathrm{id}^{\{0,1\}^k} \times \mathrm{HK}^k(G_\bullet^{+1}) \subseteq \mathrm{HK}^{k+1}(G_\bullet).$$

To check the first inclusion it suffices to check elements $(\gamma^F, \gamma^F)$ where $\gamma \in G_{\mathrm{codim}(F)}$.
But it is easy to see that $(\gamma^F, \gamma^F) = \gamma^{\tilde F}$ inside $G^{\{0,1\}^{k+1}}$, where the codimension of the
face $\tilde F$ inside $\{0,1\}^{k+1}$ equals $\mathrm{codim}(F)$, and the inclusion follows. To check the second
inclusion it suffices to check elements $(\mathrm{id}^{\{0,1\}^k}, \gamma^F)$ where $\gamma \in G^{+1}_{\mathrm{codim}(F)} = G_{\mathrm{codim}(F)+1}$.
But it is again easy to see that $(\mathrm{id}^{\{0,1\}^k}, \gamma^F) = \gamma^{\tilde F}$, where now the codimension of
$\tilde F$ inside $\{0,1\}^{k+1}$ is $\mathrm{codim}(F) + 1$. This concludes the proof of the only if part of
Proposition 6.5; the perceptive reader will have noticed that we have not yet made any
essential use of the main property of prefiltrations, namely the nesting property that
$[G_i, G_j] \subseteq G_{i+j}$.

We turn now to the proof of the if direction of the proposition. We are to show
that if $g^{\{0,1\}^k}$ maps $H^{[k]}$ to $\mathrm{HK}^k(G_\bullet)$ for all $k$, then $g \in \mathrm{poly}(H, G_\bullet)$. Pick an element
$z = (xh^\omega)_{\omega \in \{0,1\}^k}$ in $H^{[k]}$. By Lemma 6.4 (which does use the nesting property of $G_\bullet$)
we may write

$$g^{\{0,1\}^k}(z) = \gamma_0^{F_0}\eta_1 \cdots \eta_k, \qquad \eta_i := \prod_{j=2^{i-1}}^{2^i-1} \gamma_j^{F_j}, \tag{6.3}$$

where $\gamma_j \in G_{\mathrm{codim}(F_j)}$. One may check that the $\eta_i$ enjoy the following support properties: $(\eta_i)_\omega = \mathrm{id}_G$ unless
$\omega_{i+1}, \dots, \omega_k$ are all nonzero, and $(\eta_i)_\omega = (\eta_i)_{\omega'}$ if $\omega, \omega'$ differ only in the $\omega_i$ coordinate.
One may now examine (6.3) coordinatewise, peeling off $\eta_k, \eta_{k-1}, \dots$ in turn, to
eventually conclude that

$$\gamma_0 = \partial_{h_1} \cdots \partial_{h_k}g(x).$$

Now we know that $\gamma_0 \in G_{\mathrm{codim}(F_0)} = G_k$, and thus we have proved that $\partial_{h_1} \cdots \partial_{h_k}g$ takes
values in $G_k$, as required. □

Polynomial maps in coordinates. From now on we specialise to the case of
polynomial maps from $\mathbb{Z}^t$ to $G$ and revert to dealing with filtrations as opposed to
prefiltrations. Our aim in this section is to describe the elements of $\mathrm{poly}(\mathbb{Z}^t, G_\bullet)$ using
the Mal'cev coordinate map $\psi : G \to \mathbb{R}^m$ relative to some Mal'cev basis $\mathcal{X}$ for $G/\Gamma$
adapted to the filtration $G_\bullet$.

Definition 6.6 (Multi-binomial coefficients). Let $t \geq 1$ be an integer. Suppose that
$\vec n = (n_1, \dots, n_t)$ and that $\vec j = (j_1, \dots, j_t) \in \mathbb{Z}^t_{\geq 0}$ is a set of indices. Then we write

$$\binom{\vec n}{\vec j} := \binom{n_1}{j_1} \cdots \binom{n_t}{j_t}.$$

A version of the following lemma may be found in [2^| §4].

Lemma 6.7 (Description of $\mathrm{poly}(\mathbb{Z}^t, G_\bullet)$ in bases). Suppose that $G/\Gamma$ is a nilmanifold
of dimension $m$ and that $\mathcal{X}$ is a Mal'cev basis for $G/\Gamma$ adapted to some filtration $G_\bullet$.
Then $g \in \mathrm{poly}(\mathbb{Z}^t, G_\bullet)$ if and only if the coordinates $\psi(g(\vec n))$ have the form

$$\psi(g(\vec n)) = \sum_{\vec j} t_{\vec j}\binom{\vec n}{\vec j},$$

where each $t_{\vec j}$ lies in $\mathbb{R}^m$ and is such that $(t_{\vec j})_i = 0$ if $i \leq m - m_{|\vec j|}$, where $|\vec j| := j_1 + \dots + j_t$.

Remark. The presence of the discrete subgroup $\Gamma$ is not at all relevant to this lemma;
however we have only defined Mal'cev bases in this context.

Proof. We start with the if direction. If $g(\vec n)$ has the form stated then it is a product
of sequences of the form $\vec n \mapsto a^{\binom{\vec n}{\vec j}}$ where $a \in G_{|\vec j|}$. By the group property of $\mathrm{poly}(\mathbb{Z}^t, G_\bullet)$
it therefore suffices to establish the result in the case that $g(\vec n)$ is actually equal to such
a sequence. By induction one sees that the derivative $\partial_{h_1} \cdots \partial_{h_k}g(\vec n)$ equals $a^{p(h_1, \dots, h_k, \vec n)}$,
where the maximal degree $\alpha_1 + \dots + \alpha_t$ of a monomial $n_1^{\alpha_1} \cdots n_t^{\alpha_t}$ appearing in $p$ is at
most $\max(|\vec j| - k, 0)$. Thus we see that this derivative lies in $G_{|\vec j|}$ if $k \leq |\vec j|$ and is the identity
otherwise. It follows that $g \in \mathrm{poly}(\mathbb{Z}^t, G_\bullet)$.
To prove the only if direction, let $\mathfrak{h}_j \subseteq \mathfrak{g}$ be the subspace

$$\mathfrak{h}_j := \mathrm{Span}(X_{j+1}, \dots, X_m)$$

and set $H_j := \exp(\mathfrak{h}_j)$. By the nesting property of the Mal'cev basis $\mathcal{X}$ (see (A.1)) we
see that $H_j \lhd G$.

Suppose as a hypothesis for downward induction on $k$ that the statement has been
proved for all $g \in \mathrm{poly}(\mathbb{Z}^t, G_\bullet)$ with $g(\vec n) \in H_k$ for all $\vec n$, for a certain value of $k$. This
is trivial for $k = m$, in which case $g(\vec n) = \mathrm{id}_G$. Suppose that $g(\vec n) \in H_{k-1}$ for all $\vec n$. Let
$\pi : H_{k-1} \to H_{k-1}/H_k \cong \mathbb{R}$ be the natural projection. Then $p_{k-1}(\vec n) := \pi(g(\vec n))$ is a
polynomial map from $\mathbb{Z}^t$ to $\mathbb{R}$. Suppose that $k - 1 < m - m_i$, and that $i$ is minimal subject
to this property. Then for any $h_1, \dots, h_i \in \mathbb{Z}^t$ we have $\partial_{h_1} \cdots \partial_{h_i}g \in G_i = H_{m-m_i}$, and
therefore $\partial_{h_1} \cdots \partial_{h_i}p_{k-1}(\vec n) = 0$. Thus the total degree of any monomial in $p_{k-1}$ is at
most $i - 1$. Therefore we may write the sequence $h(\vec n)$ defined by

$$h(\vec n) := \exp(X_k)^{p_{k-1}(\vec n)}$$

as a product of sequences $\exp(X_k)^{c_{\vec j}\binom{\vec n}{\vec j}}$ with $|\vec j| \leq i - 1$ and $c_{\vec j} \in \mathbb{R}$. By the minimality of $i$ we
have $X_k \in \mathfrak{g}_{i-1}$, and so each of these sequences lies in $\mathrm{poly}(\mathbb{Z}^t, G_\bullet)$, and hence so does
$h$. It follows that the sequence $\tilde g(\vec n) := g(\vec n)h(\vec n)^{-1}$ lies in $\mathrm{poly}(\mathbb{Z}^t, G_\bullet)$. But this new
sequence $\tilde g$ has $\tilde g(\vec n) \in H_k$, and hence we may proceed by induction. □
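The multi-binomial coefficients of Definition 6.6 and the coordinate form in Lemma 6.7 can be sketched in a few lines of Python. This is only an illustration: the names `binom`, `multi_binom` and `coordinates` are hypothetical, integer coefficient vectors are used for simplicity, and the binomial coefficient is taken in its generalised form $n(n-1)\cdots(n-j+1)/j!$ so that negative $n$ are allowed.

```python
from math import factorial, prod

def binom(n, j):
    """Generalised binomial coefficient n(n-1)...(n-j+1)/j! for integer n and j >= 0."""
    num = 1
    for i in range(j):
        num *= n - i
    return num // factorial(j)   # exact: a product of j consecutive integers is divisible by j!

def multi_binom(n, j):
    """Product of binom(n_i, j_i) over the coordinates of n and j."""
    return prod(binom(ni, ji) for ni, ji in zip(n, j))

def coordinates(n, terms):
    """Evaluate sum_j t_j * multi_binom(n, j); `terms` maps index tuples j to vectors t_j."""
    m = len(next(iter(terms.values())))
    return tuple(sum(t[i] * multi_binom(n, j) for j, t in terms.items())
                 for i in range(m))

# A toy coordinate polynomial with m = 3 and t = 1 (coefficients chosen arbitrarily).
terms = {(1,): (1, 0, 0), (2,): (0, 0, 3)}
print(coordinates((4,), terms))
```

The Pascal identity $\binom{n+1}{j} - \binom{n}{j} = \binom{n}{j-1}$, which holds for the generalised coefficient too, is what makes each discrete derivative lower the degree by one in the proof of the lemma.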

A useful and easily-derived corollary of Lemma 6.7 is that $\mathrm{poly}(\mathbb{Z}^t, G_\bullet)$ is closed under
dilations.

Corollary 6.8 (Dilation of polynomial sequences). Suppose that $g \in \mathrm{poly}(\mathbb{Z}^t, G_\bullet)$ and
that $a_1, \dots, a_t, b_1, \dots, b_t \in \mathbb{Z}$. Then the sequence $\vec n \mapsto g(a_1 + b_1n_1, \dots, a_t + b_tn_t)$ also lies
in $\mathrm{poly}(\mathbb{Z}^t, G_\bullet)$. □

We remarked in the introduction that a sequence $g : \mathbb{Z} \to G$ is polynomial with
coefficients in some filtration $G_\bullet$ if and only if $g$ has the form

$$g(n) = a_1^{p_1(n)} \cdots a_k^{p_k(n)} \tag{6.4}$$

for polynomials $p_1, \dots, p_k$ with integer coefficients. Although this result is not required
in the paper it is certainly conceivable that one might wish to apply the main theorems
of the paper to a sequence which is presented in an explicit form such as (6.4), and does
not obviously satisfy the more abstract condition of Definition 1.8.

The fact that every polynomial sequence has the form (6.4) is an easy consequence of
Lemma 6.7. To establish the converse, consider first the lower central series filtration
$G_\bullet$, which has degree $s$, the step of the nilpotent Lie group $G$. Let $d$ be the maximum
degree occurring amongst the polynomials $p_i$ and define a finer filtration $G'_\bullet$ of degree
$sd$ by setting $G'_i := G_{\lceil i/d \rceil}$. This is a filtration since

$$[G'_i, G'_j] = [G_{\lceil i/d \rceil}, G_{\lceil j/d \rceil}] \subseteq G_{\lceil i/d \rceil + \lceil j/d \rceil} \subseteq G_{\lceil (i+j)/d \rceil} = G'_{i+j}.$$

Any sequence of the form $n \mapsto a^{\binom{n}{j}}$, $j \leq d$, has coefficients in $G'_\bullet$ since $G'_i = G$ for
$i = 0, 1, \dots, d$ and the $(d+1)$st derivative of such a sequence is trivial. Since $g$ is a
product of such sequences and $\mathrm{poly}(\mathbb{Z}, G'_\bullet)$ is a group we see that $g \in \mathrm{poly}(\mathbb{Z}, G'_\bullet)$.

We note that if $G/\Gamma$ has a $Q$-rational Mal'cev basis adapted to the lower central series
then, by the results of the appendix, there is a $Q^{O(1)}$-rational Mal'cev basis for $G/\Gamma$
adapted to $G'_\bullet$.
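The only arithmetic fact needed to see that $G'_i := G_{\lceil i/d \rceil}$ is a filtration is the subadditivity $\lceil i/d \rceil + \lceil j/d \rceil \geq \lceil (i+j)/d \rceil$. The short Python sketch below checks this on a range of values, together with the two boundary facts used in the text. The function name `refined_index` and the sample step `s` are illustrative.

```python
def refined_index(i, d):
    """ceil(i / d) in exact integer arithmetic: G'_i is the group G_{ceil(i/d)}."""
    return -(-i // d)

for d in range(1, 6):
    # G'_i = G for i = 0, ..., d, because G_0 = G_1 = G.
    assert all(refined_index(i, d) <= 1 for i in range(d + 1))
    for i in range(40):
        for j in range(40):
            # nesting [G'_i, G'_j] inside G'_{i+j} needs this inequality on indices
            assert refined_index(i, d) + refined_index(j, d) >= refined_index(i + j, d)

s, d = 2, 3                              # e.g. a 2-step group and polynomials of degree 3
print(refined_index(s * d, d), refined_index(s * d + 1, d))
```

With $s = 2$ and $d = 3$ the last line shows that $G'_{sd} = G_s$ while $G'_{sd+1} = G_{s+1}$ is trivial, so $G'_\bullet$ has degree $sd$.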
We leave it to the reader to formulate and prove an analogous result for polynomial
mappings from $\mathbb{Z}^t$ to $G$.



We are now in a position to attack the general case of Theorem 2.9. Our analysis of the
Heisenberg example in §5 suggested that the argument will involve an induction on the
degree $d$ of $G_\bullet$. In that case there were two different scenarios in which we reduced from
the case $d = 2$ to the case $d = 1$. Whilst the same is true in general, the introduction
of genuinely polynomial sequences (rather than just linear ones) necessitates a further
inductive loop on the quantity $m_* := m_{\mathrm{ab}} - m_{\mathrm{lin}}$, which we call the nonlinearity degree.
To see why, consider the following slightly informal example.



Example. Let G/Y be the Heisenberg example, and let g(n) = ( Q l a% \ , where 
$\alpha_1, \alpha_2$ and $\alpha_3$ are highly independent over $\mathbb{Q}$. Then there is no horizontal character $\eta$
of low frequency such that $\|\eta \circ g\|_{C^\infty[N]}$ is small.

Now we have dg = g and d l g = idc for % ^ 2, and so g has coefficients in the 
subgroup sequence $G_\bullet$ defined by $G_{(0)} := G_{(1)} := G_{(2)} := G$, $G_{(3)} := G_{(4)} := G_2$, and
$G_{(i)} := \{\mathrm{id}_G\}$ for $i \geq 5$. With this choice we have $G^\square = G \times G$. However $g^\square$ takes values
in $G \times_{G_2} G$, and hence $\eta^\square \circ g^\square = 0$ for any horizontal character $\eta^\square$ with frequency of
the form $(a, b, -a, -b) \in \mathbb{Z}^4$. Thus, a lack of uniform distribution for $g^\square$ does not imply
lack of uniform distribution for $g$.

The problem in the above example is that the filtration $G_\bullet$ was far too "coarse" to
accurately capture the differential structure of the sequence $g$. Indeed $g$ also takes values
in the minimal (lower central series) filtration, as we saw in §5.

In the light of the above example we can expect that it will sometimes be necessary
to pass to a "finer" filtration of the same degree $d$, in order to properly capture the
differential structure of $g$. This finer filtration will have a smaller value of the nonlinearity
degree $m_*$, and thus we introduce an extra inductive loop to incorporate this parameter.
To be precise we shall prove, by induction on $d$ and $m_*$, the following slight variant of
Theorem 2.9.

Theorem 7.1 (Variant of Main Theorem). Let $m, m_*, d \geq 0$ be integers with $m_* \leq m$. Let
$0 < \delta < 1/2$ and suppose that $N \geq 1$. Suppose that $G/\Gamma$ is a nilmanifold and that $G_\bullet$ is
a filtration of degree $d$ and with nonlinearity degree $m_*$. Suppose that $\mathcal{X}$ is a $\frac{1}{\delta}$-rational
Mal'cev basis adapted to $G_\bullet$ and suppose that $g \in \mathrm{poly}(\mathbb{Z}, G_\bullet)$. If $(g(n)\Gamma)_{n \in [N]}$ is not
$\delta$-equidistributed then there is a horizontal character $\eta$ with $0 < |\eta| \ll \delta^{-O_{m,m_*,d}(1)}$ such
that

$$\|\eta \circ g\|_{C^\infty[N]} \ll \delta^{-O_{m,m_*,d}(1)}.$$

It is clear that this does imply Theorem 2.9, since the dependence of the $O(1)$ exponents
on $m_*$ may be suppressed once Theorem 7.1 has been proven by induction. In
our proof there will be an outer inductive loop over $d$ and an inner one over $m_*$. In
other words we shall assume that Theorem 7.1 holds for all pairs $(d', m'_*)$ in which either
$d' < d$ or for which $d' = d$ and $m'_* < m_*$, and deduce the case $(d, m_*)$.

7. The general case of the main theorem
Henceforth we allow all constants implicit in the $\ll$ or $O$-notation to depend on $d$, $m$
and $m_*$.

We begin with some simple reductions. By Lemma 3.7 we may assume that the orbit
$(g(n)\Gamma)_{n \in [N]}$ is not $\delta^{O(1)}$-equidistributed along some vertical frequency $\xi \in \mathbb{Z}^{m_d}$ with
$|\xi| \ll \delta^{-O(1)}$. Thus there is some function $F : G/\Gamma \to \mathbb{C}$ with $\|F\|_{\mathrm{Lip}} \leq 1$ and vertical
frequency $\xi$ such that

$$\Big|\mathbb{E}_{n \in [N]} F(g(n)\Gamma) - \int_{G/\Gamma} F\Big| \gg \delta^{O(1)}. \tag{7.1}$$

If £ = then F is G^-invariant and we may descend to Gj Gd, together with the filtration 
G,/G(i which has length d — 1, and invoke our inductive hypothesis. We pause to give 
the rather straightforward details. 

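Write $\overline{G} := G/G_d$ and $\overline{\Gamma} := \Gamma/(\Gamma \cap G_d)$. Then
$\overline{G}/\overline{\Gamma}$ is a nilmanifold together with a filtration $\overline{G}_\bullet$
of length $d - 1$, where $\overline{G}_i := G_i/G_d$. The Mal'cev basis
$\mathcal{X} = \{X_1, \dots, X_m\}$ may be reduced to give a $\frac{1}{\delta}$-rational Mal'cev basis
$\overline{\mathcal{X}} = \{\overline{X}_1, \dots, \overline{X}_{\overline{m}}\}$ for
$\overline{G}/\overline{\Gamma}$ adapted to $\overline{G}_\bullet$, where $\overline{m} := m - m_d$.

Write $\overline{g} : \mathbb{Z} \to \overline{G}$ for the reduction of $g \pmod{G_d}$. By the
$G_d$-invariance the function $F$ descends to a Lipschitz function
$\overline{F} : \overline{G}/\overline{\Gamma} \to \mathbb{C}$ with
$\|\overline{F}\|_{\mathrm{Lip}} \leq \|F\|_{\mathrm{Lip}}$, and so (7.1) implies that

$$\Big| \mathbb{E}_{n \in [N]} \overline{F}(\overline{g}(n)\overline{\Gamma}) - \int_{\overline{G}/\overline{\Gamma}} \overline{F} \Big| > \delta^{O(1)} \|\overline{F}\|_{\mathrm{Lip}}.$$

(Here we have used the fact that normalised Haar measure on $\overline{G}/\overline{\Gamma}$ is
obtained by quotienting that on $G/\Gamma$ by $G_d$.)

We may now apply the inductive hypothesis to obtain a horizontal character
$\overline{\eta} : \overline{G} \to \mathbb{R}/\mathbb{Z}$ on $\overline{G}$ of frequency magnitude
$0 < |\overline{\eta}| \ll \delta^{-O(1)}$ such that

$$\|\overline{\eta} \circ \overline{g}\|_{C^\infty[N]} \ll \delta^{-O(1)}.$$

If we let $\eta : G \to \mathbb{R}/\mathbb{Z}$ be the horizontal character on $G$ defined by
$\eta(x) = \overline{\eta}(\overline{x})$ then we have $\eta \circ g = \overline{\eta} \circ \overline{g}$
and $|\eta| = |\overline{\eta}|$. This concludes the proof in the case $\xi = 0$.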

Suppose henceforth that $\xi \neq 0$. Since $F$ has $\xi$ as a vertical frequency, (7.1) becomes

$$|\mathbb{E}_{n \in [N]} F(g(n)\Gamma)| \gg \delta^{O(1)}. \tag{7.2}$$

We proceed initially with two additional reductions. The first is to the case $g(0) = \mathrm{id}_G$.
Factorize $g(0) = \{g(0)\}[g(0)]$ as in Lemma A.14. Set
$\tilde{g}(n) := \{g(0)\}^{-1} g(n) g(0)^{-1} \{g(0)\}$.
Then we have $|\mathbb{E}_{n \in [N]} \tilde{F}(\tilde{g}(n)\Gamma)| \geq \delta$, where
$\tilde{F}(x) := F(\{g(0)\}x)$. But $\tilde{F}$ still has vertical oscillation $\xi$ and, by
Lemma A.5, it has Lipschitz constant $O(1)$. Noting that
$\|\eta \circ \tilde{g}\|_{C^\infty[N]} = \|\eta \circ g\|_{C^\infty[N]}$ we see that if we have
Theorem 7.1 for $\tilde{g}$ then we also have it for $g$.

The second reduction is to the case when $|\psi(g(1))| \leq 1$ (this is needed in the lead
up to (7.16)). To do this, factorize $g(1) = \{g(1)\}[g(1)]$ as in Lemma A.14. Set
$\tilde{g}(n) := g(n)[g(1)]^{-n}$. Then $\tilde{g}(n)\Gamma = g(n)\Gamma$, $\tilde{g}(0) = \mathrm{id}_G$,
$\tilde{g} \in \mathrm{poly}(\mathbb{Z}, G_\bullet)$ and $\pi(\tilde{g}(n)\Gamma) = \pi(g(n)\Gamma)$,
so proving Theorem 7.1 for $\tilde{g}$ is equivalent to proving it for $g$.

Henceforth we assume $g(0) = \mathrm{id}_G$ and $|\psi(g(1))| \leq 1$.

As in §3 we apply Van der Corput's Lemma (Corollary 4.2) to (7.2) to deduce that
for $\gg \delta^{O(1)} N$ values of $h$, we have

$$|\mathbb{E}_{n \in [N]} F(g(n+h)\Gamma)\overline{F(g(n)\Gamma)}| \gg \delta^{O(1)}. \tag{7.3}$$

For each fixed $h$ this may be interpreted as a statement about the polynomial sequence
$(g(n+h), g(n))$ on the product group $G^2$. However, guided by our experience with the
Heisenberg group, it is natural to try and interpret it as a sequence on a somewhat
smaller group. To this end, we define the nonlinear part $g_2$ of $g$ by

$$g_2(n) := g(n) g(1)^{-n}. \tag{7.4}$$

Motivated by what we did in §5 we may then rewrite (7.3) in the form

$$|\mathbb{E}_{n \in [N]} \tilde{F}_h(\tilde{g}_h(n)\Gamma^2)| \gg \delta^{O(1)}, \tag{7.5}$$

where

$$\tilde{F}_h(x, y) := F(\{g(1)^h\}x)\overline{F(y)}$$

and

$$\tilde{g}_h(n) := \big(\{g(1)^h\}^{-1} g_2(n+h) g(1)^n \{g(1)^h\},\ g_2(n) g(1)^n\big). \tag{7.6}$$

It turns out that $\tilde{g}_h$ takes values in $G^\Box := G \times_{G_2} G$, just as we found in our
analysis of the Heisenberg case. To prove this note that we have $G_2 \supseteq [G, G]$, and so $G$
becomes abelian after quotienting out by the normal subgroup $G_2$. Thus we need only prove that
$g_2(n) \in G_2$ for all $n$. We have $\partial^2 g(n) = \mathrm{id}_G$ modulo $G_2$. Since
$g(0) = \mathrm{id}_G$, this implies by an easy induction that $g(n) = g(1)^n$ modulo $G_2$, and so
$g_2$ does indeed take values in $G_2$.

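We may therefore replace (7.5) by

$$|\mathbb{E}_{n \in [N]} F_h^\Box(g_h^\Box(n)\Gamma^\Box)| \gg \delta^{O(1)} \tag{7.7}$$

by restricting everything in that equation to an object on $G^\Box$.

Note that, exactly as in the Heisenberg case, $F_h^\Box$ is invariant under
$G_d^\triangle = \{(g_d, g_d) : g_d \in G_d\}$. Indeed, since $G_d$ is central in $G$, we have, for
$x^\Box = (x, x')$,

$$F_h^\Box((g_d, g_d) \cdot x^\Box) = F(\{g(1)^h\} g_d x)\overline{F(g_d x')}$$
$$= e(\xi(g_d)) e(-\xi(g_d)) F(\{g(1)^h\}x)\overline{F(x')}.$$

Thus $F_h^\Box$ descends to a function $\overline{F_h^\Box}$ on
$\overline{G^\Box} := G^\Box/G_d^\triangle$ and we may write (7.7) as

$$|\mathbb{E}_{n \in [N]} \overline{F_h^\Box}(\overline{g_h^\Box}(n)\overline{\Gamma^\Box})| \gg \delta^{O(1)}, \tag{7.8}$$

where $\overline{\Gamma^\Box} := \Gamma^\Box/(\Gamma^\Box \cap G_d^\triangle)$.

The next proposition is central to our whole argument in that it clarifies the sense in
which $G^\Box$ is "less complex" than $G$.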

Proposition 7.2 (Reduction in degree). Define $(G^\Box)_i := G_i \times_{G_{i+1}} G_i$ for
$i = 1, \dots, d$. Then $(G^\Box)_\bullet$ is a filtration on $G^\Box$ of degree $d$. Since
$(G^\Box)_d = G_d^\triangle$, it descends under quotienting by $G_d^\triangle$ to a filtration
$(\overline{G^\Box})_\bullet$ of degree $d - 1$ on $\overline{G^\Box}$. Each polynomial sequence
$g_h^\Box$ lies in $\mathrm{poly}(\mathbb{Z}, (G^\Box)_\bullet)$, and hence each reduced polynomial
sequence $\overline{g_h^\Box}$ lies in $\mathrm{poly}(\mathbb{Z}, (\overline{G^\Box})_\bullet)$.

Proof. We start with a lemma.

Lemma 7.3. Suppose that $H_1, H_2$ and $K_1, K_2$ are normal subgroups of a group $G$,
that $H_1, H_2$ generate a group $H$ and that $K_1, K_2$ generate a group $K$. Then $[H, K]$ is
generated by the groups $[H_i, K_j]$, $1 \leq i, j \leq 2$.

Proof. The groups $[H_i, K_j]$ are all normal, and thus the group they generate is also
normal. If we quotient by that group, then $H_1, H_2$ commute with $K_1, K_2$, and thus $H$
commutes with $K$. The claim follows. □

Now observe that $(G^\Box)_i$ is generated by $G_{i+1}^2$ and $G_i^\triangle$. In view of the lemma it
therefore suffices to establish that all four of the quantities

$$[G_i^\triangle, G_j^\triangle],\quad [G_i^\triangle, G_{j+1}^2],\quad [G_{i+1}^2, G_j^\triangle],\quad [G_{i+1}^2, G_{j+1}^2]$$

lie in $(G^\Box)_{i+j}$. Using the fact that $G_\bullet$ is a filtration, the first quantity is
manifestly contained in $G_{i+j}^\triangle$ and the last three lie in $G_{i+j+1}^2$. It follows
immediately that $(G^\Box)_\bullet$ is indeed a filtration.

Next we show that $g_h^\Box \in \mathrm{poly}(\mathbb{Z}, (G^\Box)_\bullet)$. Here we make serious
use of the fact that $\mathrm{poly}(\mathbb{Z}, G_\bullet)$ is a group for the first time. Recall that

$$g_h^\Box(n) := \big(\{g(1)^h\}^{-1} g_2(n+h) g(1)^n \{g(1)^h\},\ g_2(n) g(1)^n\big). \tag{7.9}$$

Now $\mathrm{poly}(\mathbb{Z}, (G^\Box)_\bullet)$ is a group, and it is also closed under conjugation
by elements of $G^2$. Since $(g(1)^n, g(1)^n)$ is obviously in
$\mathrm{poly}(\mathbb{Z}, (G^\Box)_\bullet)$, it suffices to check that
$(g_2(n+h), g_2(n)) \in \mathrm{poly}(\mathbb{Z}, (G^\Box)_\bullet)$. Of course,
$g_2 \in \mathrm{poly}(\mathbb{Z}, G_\bullet)$ and hence, by Lemma 6.7, it is a product of elements
$g_i^{\binom{n}{i}}$ with $g_i \in G_i$. It therefore suffices to show that
$(g_i^{\binom{n+h}{i}}, g_i^{\binom{n}{i}}) \in \mathrm{poly}(\mathbb{Z}, (G^\Box)_\bullet)$.
Taking $j$th derivatives, it suffices to check that
$g_i^{\binom{n+h}{i-j}} = g_i^{\binom{n}{i-j}} \pmod{G_{j+1}}$. For $j < i$ this follows from the
fact that $g_i \in G_i$, whilst for $j \geq i$ it is trivial. □

In order to apply the inductive hypothesis, we must specify a Mal'cev basis $\mathcal{X}^\Box$ for
$G^\Box/\Gamma^\Box$ adapted to the sequence $(G^\Box)_\bullet$, and it must then be checked that
$F_h^\Box$ is Lipschitz with respect to the metric $d_{\mathcal{X}^\Box}$. These are rather tedious
matters and we recommend that the reader take the following lemma on trust on a first reading of
the paper.

Lemma 7.4 (Rationality bounds for the relative square). There is an $O(\delta^{-O(1)})$-rational
Mal'cev basis $\mathcal{X}^\Box = \{X_1^\Box, \dots, X_{m^\Box}^\Box\}$ for $G^\Box/\Gamma^\Box$
adapted to the filtration $(G^\Box)_\bullet$ with the property that $\psi_{\mathcal{X}^\Box}(x, x')$
is a polynomial of degree $O(1)$ with rational coefficients of height $\delta^{-O(1)}$ in the
coordinates $\psi(x), \psi(x')$. With respect to the metric $d_{\mathcal{X}^\Box}$ we have
$\|F_h^\Box\|_{\mathrm{Lip}} \leq \delta^{-O(1)}$ uniformly in $h$.

Proof. We consider $G^\Box$ as a subgroup of $G \times G$. Recall (cf. Definition A.7) the
definition of a weak basis. It is clear that
$\mathcal{X} \times \mathcal{X} = \{(X_1, 0), (0, X_1), \dots, (X_m, 0), (0, X_m)\}$
is a $\delta^{-O(1)}$-rational weak basis for $G/\Gamma \times G/\Gamma$ and that each of the groups
$(G^\Box)_i := G_i \times_{G_{i+1}} G_i$ is $\delta^{-O(1)}$-rational with respect to this basis.
By Proposition A.10 it follows that there is a Mal'cev basis
$\mathcal{X}^\Box = \{X_1^\Box, \dots, X_{m^\Box}^\Box\}$ for $G^\Box/\Gamma^\Box$, adapted to the
filtration $(G^\Box)_\bullet$, with the property that each $X_i^\Box$ is a $\delta^{-O(1)}$-rational
combination of the elements of $\mathcal{X} \times \mathcal{X}$. By adding the elements
$(X_1, 0), \dots, (X_{m_{\mathrm{lin}}}, 0)$ to $\mathcal{X}^\Box$ we obtain a weak basis
$\mathcal{Y}$ for $G/\Gamma \times G/\Gamma$ which enjoys the nesting property (A.1). From Lemma A.2
it follows that each coordinate of $\psi_{\mathcal{Y}}(x, x')$ is a polynomial of degree $O(1)$ and
with coefficients of height $\delta^{-O(1)}$ in the coordinates
$\psi_{\mathcal{X} \times \mathcal{X}}(x, x')$. Restricting to those pairs $(x, x')$ which lie in
$G^\Box$, we obtain the stated property.

Recall that $F_h^\Box(x^\Box) = F(\{g(1)^h\}x)\overline{F(x')}$. Now by definition we have
$|\psi_{\mathcal{X}}(\{g(1)^h\})| \leq 1$. By Lemma A.5 (and Lemma A.14, which guarantees that every
$x \in G/\Gamma$ has a representative with coordinates bounded by $O(1)$) we see that
$(x, x') \mapsto F(\{g(1)^h\}x)\overline{F(x')}$ defines a function on $G \times G$ whose Lipschitz
constant with respect to the product metric $d \times d$ is $\ll \delta^{-O(1)}$. Now by Lemma A.6
and the construction of $\mathcal{X}^\Box$ we therefore have
$\|F_h^\Box\|_{\mathrm{Lip}} \leq \delta^{-O(1)}$ where, remember, the Lipschitz constant is being
computed with respect to the metric $d_{\mathcal{X}^\Box}$. □

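Let us now resume the discussion starting from (7.8). We begin by reprising some of
the straightforward arguments at the start of the section (where we dealt with the case
$\xi = 0$). By reducing the first $\overline{m^\Box} := m^\Box - m_d$ elements of $\mathcal{X}^\Box$
we obtain an $O(\delta^{-O(1)})$-rational Mal'cev basis
$\overline{\mathcal{X}^\Box} = \{\overline{X_1^\Box}, \dots, \overline{X^\Box_{\overline{m^\Box}}}\}$
for $\overline{G^\Box}/\overline{\Gamma^\Box}$ adapted to the filtration $(\overline{G^\Box})_\bullet$.
With respect to the metric $d_{\overline{\mathcal{X}^\Box}}$ we have
$\|\overline{F_h^\Box}\|_{\mathrm{Lip}} \leq \delta^{-O(1)}$.

Since $(\overline{G^\Box})_\bullet$ has degree $d - 1$ our inductive hypothesis is applicable and we
conclude that for $\gg \delta^{O(1)} N$ values of $h \in [N]$ there is some horizontal character
$\overline{\eta}_h : \overline{G^\Box} \to \mathbb{R}/\mathbb{Z}$ with
$0 < |\overline{\eta}_h| \ll \delta^{-O(1)}$ and

$$\|\overline{\eta}_h \circ \overline{g_h^\Box}\|_{C^\infty[N]} \ll \delta^{-O(1)}.$$

By pigeonholing in $h$ we may assume that $\overline{\eta} = \overline{\eta}_h$ is independent of
$h$. Writing $\eta : G^\Box \to \mathbb{R}/\mathbb{Z}$ for the horizontal character defined by
$\eta(x) = \overline{\eta}(\overline{x})$, we see that $0 < |\eta| \ll \delta^{-O(1)}$ and that

$$\|\eta \circ g_h^\Box\|_{C^\infty[N]} \ll \delta^{-O(1)}. \tag{7.10}$$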

The next lemma, which is almost identical to Lemma 5.2, allows us to write $\eta$ in
terms of maps defined on $G$ rather than $G^\Box$.

Lemma 7.5. We have a decomposition $\eta(g', g) = \eta_1(g) + \eta_2(g'g^{-1})$ for all
$(g', g) \in G^\Box$, where $\eta_1 : G \to \mathbb{R}/\mathbb{Z}$ is a horizontal character on $G$,
and $\eta_2 : G_2 \to \mathbb{R}/\mathbb{Z}$ is a horizontal character on $G_2$ which also
annihilates $[G, G_2]$. Furthermore we have $|\eta_1|, |\eta_2| \ll \delta^{-O(1)}$.

Proof. If we define $\eta_1(g) := \eta(g, g)$ and $\eta_2(g_2) := \eta(g_2, \mathrm{id}_G)$ for
$g \in G$ and $g_2 \in G_2$ then the decomposition follows since $\eta$ is an additive
homomorphism. Since $\eta$ annihilates $[G^\Box, G^\Box]$, which contains
$[G^\triangle, G_2 \times \mathrm{id}_G] = [G, G_2] \times \mathrm{id}_G$, we see that $\eta_2$
annihilates $[G, G_2]$; since $\eta$ annihilates $\Gamma^\Box$, which contains both
$\Gamma^\triangle$ and $(\Gamma \cap G_2) \times \mathrm{id}_G$, we see that $\eta_1$ and $\eta_2$
annihilate $\Gamma$ and $\Gamma \cap G_2$ respectively.

It remains to check the boundedness properties. Writing

$$\eta(x, x') = k^\Box \cdot \psi_{\mathcal{X}^\Box}(x, x'),$$

where $k^\Box \in \mathbb{Z}^{m^\Box}$, we have by definition that $|k^\Box| \ll \delta^{-O(1)}$.
The integer vectors $k_1$ and $k_2$ used to define $|\eta_1|$ and $|\eta_2|$ are then given by

$$k_1 \cdot \psi(x) = \eta_1(x) = \eta(x, x) = k^\Box \cdot \psi_{\mathcal{X}^\Box}(x, x)$$

and

$$k_2 \cdot \psi(x) = \eta_2(x) = \eta(x, \mathrm{id}_G) = k^\Box \cdot \psi_{\mathcal{X}^\Box}(x, \mathrm{id}_G).$$

That $|k_1|, |k_2| \ll \delta^{-O(1)}$ now follows immediately from the fact, established in
Lemma 7.4, that $\psi_{\mathcal{X}^\Box}(x, x')$ is a polynomial of degree $O(1)$ with rational
coefficients of height $O(\delta^{-O(1)})$ in the coordinates $\psi(x), \psi(x')$. □

Now let us return to (7.10), and reinterpret this in terms of the decomposition of $\eta$
just given. Recalling the formula (7.9) for $g_h^\Box(n)$ we therefore have

$$\eta(g_h^\Box(n)) = \eta_1(g(n)) + \eta_2\big(\{g(1)^h\}^{-1} g_2(n+h) g(1)^n \{g(1)^h\} g(1)^{-n} g_2(n)^{-1}\big)$$

which, since $\eta_2$ vanishes on $[G, G_2]$, is equal to

$$\eta_1(g(n)) + \eta_2\big(g_2(n+h) \{g(1)^h\}^{-1} g(1)^n \{g(1)^h\} g(1)^{-n} g_2(n)^{-1}\big)$$
$$= \eta_1(g(n)) + \eta_2(g_2(n+h)) - \eta_2(g_2(n)) + \eta_2\big(\{g(1)^h\}^{-1} g(1)^n \{g(1)^h\} g(1)^{-n}\big).$$

Now one easily verifies by induction on $n$ that
$y^{-1} x^n y x^{-n} = [x, y]^n \pmod{[G, [G, G]]}$. Since $\eta_2$ annihilates $[G, G_2]$, which
contains $[G, [G, G]]$, we can therefore simplify the above a little further to

$$\eta(g_h^\Box(n)) = \eta_1(g(n)) + \eta_2(g_2(n+h)) - \eta_2(g_2(n)) + n\eta_2([g(1), \{g(1)^h\}])$$
$$:= P(n) + Q(n+h) - Q(n) + \sigma(h) n, \tag{7.11}$$

where $P, Q : \mathbb{Z} \to \mathbb{R}/\mathbb{Z}$ are polynomial sequences of degree at most $d$.
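
The commutator identity $y^{-1} x^n y x^{-n} = [x, y]^n$ used above can be checked by hand in the simplest nonabelian example. The following is a minimal sketch, assuming the integer Heisenberg group (where $[G, [G, G]]$ is trivial, so the identity holds exactly) and the convention $[x, y] = x^{-1} y^{-1} x y$; the helper names `mul`, `inv`, `power`, `commutator` and `twisted_power` are illustrative and not taken from the paper.

```python
# Elements of the integer Heisenberg group are stored as triples (a, b, c),
# standing for the upper unitriangular matrix [[1, a, c], [0, 1, b], [0, 0, 1]].
# All names below are hypothetical helpers introduced for this illustration.

IDENTITY = (0, 0, 0)


def mul(x, y):
    """Matrix product of two unitriangular matrices in (a, b, c) coordinates."""
    a, b, c = x
    a2, b2, c2 = y
    return (a + a2, b + b2, c + c2 + a * b2)


def inv(x):
    """Inverse element: mul(x, inv(x)) == IDENTITY."""
    a, b, c = x
    return (-a, -b, a * b - c)


def power(x, n):
    """x^n for any integer n, by repeated multiplication."""
    base = x if n >= 0 else inv(x)
    result = IDENTITY
    for _ in range(abs(n)):
        result = mul(result, base)
    return result


def commutator(x, y):
    """Assumed convention: [x, y] = x^{-1} y^{-1} x y."""
    return mul(mul(inv(x), inv(y)), mul(x, y))


def twisted_power(x, y, n):
    """The word y^{-1} x^n y x^{-n}."""
    return mul(mul(inv(y), power(x, n)), mul(y, power(x, -n)))


x = (2, -1, 5)
y = (3, 4, -7)
checks = [twisted_power(x, y, n) == power(commutator(x, y), n) for n in range(-5, 6)]
```

In a group of step two the commutator is central, so both sides reduce to a multiple of the central coordinate; in higher step the two sides would only agree modulo $[G, [G, G]]$, which is all that the argument requires because $\eta_2$ annihilates that subgroup.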

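The next lemma is specifically designed to handle the situation that has arisen here.
In this lemma it is convenient to reprise a notation from earlier papers of ours (such as
[11]): if $\alpha \in \mathbb{R}/\mathbb{Z}$ and $Q \geq 1$ we write
$\|\alpha\|_{\mathbb{R}/\mathbb{Z}, Q} := \inf_{1 \leq q \leq Q} \|q\alpha\|_{\mathbb{R}/\mathbb{Z}}$.
In a similar spirit, for any $f : \mathbb{Z} \to \mathbb{R}/\mathbb{Z}$ define

$$\|f\|_{C^\infty[N], Q} := \inf_{1 \leq q \leq Q} \|qf\|_{C^\infty[N]}.$$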
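
To make the notation $\|\alpha\|_{\mathbb{R}/\mathbb{Z}, Q}$ concrete, here is a small sketch using exact rational arithmetic. It assumes that $\alpha$ is given by a rational representative and that $q$ ranges over the integers $1 \leq q \leq Q$; the function names are hypothetical.

```python
from fractions import Fraction


def dist_to_nearest_integer(alpha):
    """||alpha||_{R/Z}: distance from alpha to the nearest integer."""
    frac = alpha - (alpha // 1)  # fractional part in [0, 1)
    return min(frac, 1 - frac)


def norm_with_multiplier(alpha, Q):
    """||alpha||_{R/Z,Q}: minimum of ||q * alpha||_{R/Z} over integers 1 <= q <= Q."""
    return min(dist_to_nearest_integer(q * alpha) for q in range(1, Q + 1))


# alpha is far from an integer, but 3 * alpha is very close to one.
alpha = Fraction(1, 3) + Fraction(1, 1000)
plain = dist_to_nearest_integer(alpha)
twisted = norm_with_multiplier(alpha, 3)
```

Here `plain` is roughly $1/3$ while `twisted` is $3/1000$: allowing a bounded multiplier $q$ detects that $\alpha$ is close to a rational with small denominator, which is exactly the kind of information the lemma below produces.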

Lemma 7.6 (Polynomials lemma). Suppose that $P, Q : \mathbb{Z} \to \mathbb{R}/\mathbb{Z}$ are
polynomial sequences of degree at most $d$ with $P(0) = 0$ and $Q(0) = \partial Q(0) = 0$ and that
$\sigma : [N] \to \mathbb{R}/\mathbb{Z}$ is an arbitrary map. Suppose that there are
$\gg \delta^{O(1)} N$ values of $h \in [N]$ such that

$$\|P(n) + Q(n+h) - Q(n) + \sigma(h) n\|_{C^\infty[N]} \ll \delta^{-O(1)}.$$

Then $\|\partial^i Q(0)\|_{\mathbb{R}/\mathbb{Z}, \delta^{-O(1)}} \ll \delta^{-O(1)}/N^i$ for
$i \geq 3$, and

$$\|P(1) + \alpha h + \sigma(h)\|_{\mathbb{R}/\mathbb{Z}, \delta^{-O(1)}} \ll \delta^{-O(1)}/N$$

for $\gg \delta^{O(1)} N$ values of $h \in [N]$, where

$$\alpha := \partial^2 Q(0). \tag{7.12}$$

Proof. The assumption implies, looking at the second derivative at $n = 0$, that

$$\|\partial^2(P - Q)(0) + \partial^2 Q(h)\|_{\mathbb{R}/\mathbb{Z}} \ll \delta^{-O(1)}/N^2$$

for $\gg \delta^{O(1)} N$ values of $h \in [N]$. Applying Lemma 4.5 then implies that

$$\|\partial^2(P - Q)(0) + \partial^2 Q\|_{C^\infty[N], \delta^{-O(1)}} \ll \delta^{-O(1)}/N^2.$$

Thus, as stated, we have

$$\|\partial^i Q(0)\|_{\mathbb{R}/\mathbb{Z}, \delta^{-O(1)}} \ll \delta^{-O(1)}/N^i$$

for $i \geq 3$, which means in view of the Taylor expansion of $Q$ that we can write

$$Q(n) = \alpha\binom{n}{2} + R(n),$$

where $R(0) = R(1) = R(2) = 0$ and $\|R\|_{C^\infty[N], \delta^{-O(1)}} \ll \delta^{-O(1)}$.
Substituting back into our assumption yields that

$$\Big\|P(n) + (\alpha h + \sigma(h)) n + R(n+h) - R(n) + \alpha\binom{h}{2}\Big\|_{C^\infty[N]} \ll \delta^{-O(1)}$$

for $\gg \delta^{O(1)} N$ values of $h \in [N]$. Differentiating at zero and recalling that
$P(0) = 0$ we obtain

$$\|P(1) + \sigma(h) + \alpha h + \partial R(h)\|_{\mathbb{R}/\mathbb{Z}} \ll \delta^{-O(1)}/N,$$

which implies in view of the properties of $R$ that

$$\|P(1) + \sigma(h) + \alpha h\|_{\mathbb{R}/\mathbb{Z}, \delta^{-O(1)}} \ll \delta^{-O(1)}/N.$$

This completes the proof. □

Now let us recall (7.11). We know that $\|\eta \circ g_h^\Box\|_{C^\infty[N]} \ll \delta^{-O(1)}$
for $\gg \delta^{O(1)} N$ values of $h$, so let us apply the lemma with $P := \eta_1 \circ g$,

$$Q := \eta_2 \circ g_2, \tag{7.13}$$

and $\sigma(h) := \eta_2([g(1), \{g(1)^h\}])$. By pigeonholing in $h$ we see that there is some
$q \ll \delta^{-O(1)}$ such that

$$\|q\eta_1(g(1)) + q\eta_2([g(1), \{g(1)^h\}]) + q\alpha h\|_{\mathbb{R}/\mathbb{Z}} \ll \delta^{-O(1)}/N.$$

By redefining $\eta_1$ and $\eta_2$ (none of the boundedness properties of Lemma 7.5 are lost by
doing this) we may write this as

$$\|\eta_1(g(1)) + \eta_2([g(1), \{g(1)^h\}]) + q\alpha h\|_{\mathbb{R}/\mathbb{Z}} \ll \delta^{-O(1)}/N. \tag{7.14}$$

We now proceed as in §5, using Mal'cev bases to work with explicit bracket polynomials.

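Since $\eta_2$ annihilates $[G, [G, G]] \subseteq [G, G_2]$, we see that the map
$x \mapsto \eta_2([g(1), x])$ is a homomorphism. Thus there exists $\zeta \in \mathbb{R}^m$ such that

$$\eta_2([g(1), x]) = \zeta \cdot \psi(x) \pmod{\mathbb{Z}} \tag{7.15}$$

for all $x \in G$. Since $\eta_2$ annihilates $[G, G_2]$, all but the first $m_{\mathrm{lin}}$
coordinates of $\zeta$ are zero. Since we have reduced to the case $|\psi(g(1))| \leq 1$ and the
basis $\mathcal{X}$ is $\frac{1}{\delta}$-rational it follows that $|\zeta| \ll \delta^{-O(1)}$.

We now define $\beta := \eta_1(g(1))$ and $\gamma := \psi(g(1))$. Now since $[G, G] \subseteq G_2$
the map $\psi_{\mathrm{lin}} : G \to \mathbb{R}^{m_{\mathrm{lin}}}$ which picks out the first
$m_{\mathrm{lin}}$ Mal'cev coordinates is a homomorphism, and therefore the first
$m_{\mathrm{lin}}$ coordinates of $\psi(g(1)^h)$ are just $\gamma h$. We may now rewrite (7.14) as

$$\|\beta + q\alpha h + \zeta \cdot \{\gamma h\}\|_{\mathbb{R}/\mathbb{Z}} \ll \delta^{-O(1)}/N \tag{7.16}$$

for $\gg \delta^{O(1)} N$ values of $h \in [N]$.

This assumption is the same as in Proposition 5.3 except that we do not have a
bound on $|q\alpha|$. However, we have

Claim 7.7. At least one of the following statements holds:

(i) There is $r \ll \delta^{-O(1)}$ such that
$\|r\zeta_i \pmod{\mathbb{Z}}\|_{\mathbb{R}/\mathbb{Z}} \ll \delta^{-O(1)}/N$ for
$i = 1, \dots, m_{\mathrm{lin}}$;

(ii) There exists $k \in \mathbb{Z}^{m_{\mathrm{lin}}}$, $0 < |k| \ll \delta^{-O(1)}$, such that
$\|k \cdot \gamma\|_{\mathbb{R}/\mathbb{Z}} \ll \delta^{-O(1)}/N$.

Proof. We apply Proposition 5.3 with
$\zeta' := (\zeta, 1) \in \mathbb{R}^{m_{\mathrm{lin}}} \times \mathbb{R}$,
$\gamma' := (\gamma, q\alpha) \in \mathbb{R}^{m_{\mathrm{lin}}} \times \mathbb{R}$ and $\alpha' := 0$,
deducing that either $|\zeta_i| \ll \delta^{-O(1)}/N$ for all $i = 1, \dots, m_{\mathrm{lin}}$ (in
which case (i) holds) or else there exist $k \in \mathbb{Z}^{m_{\mathrm{lin}}}$ and
$r \in \mathbb{Z}$, not both zero and with $|k|, |r| \ll \delta^{-O(1)}$, such that
$\|k \cdot \gamma + qr\alpha\|_{\mathbb{R}/\mathbb{Z}} \ll \delta^{-O(1)}/N$. If $r = 0$ then (ii)
holds, so assume that $r \neq 0$. Multiplying (7.16) through by $r$ we see that for
$\gg \delta^{O(1)} N$ values of $h \in [N]$ we have

$$\|\tilde{\beta} + \tilde{\alpha} h + \tilde{\zeta} \cdot \{\gamma h\}\|_{\mathbb{R}/\mathbb{Z}} \ll \delta^{-O(1)}/N,$$

where $\tilde{\beta} := r\beta$, $\tilde{\alpha} := \{k \cdot \gamma + qr\alpha\}$ satisfies
$|\tilde{\alpha}| \leq \delta^{-O(1)}/N$ and $\tilde{\zeta} := r\zeta - k$. Thus we may apply
Proposition 5.3 once more to conclude that either $|\tilde{\zeta}_i| \ll \delta^{-O(1)}/N$ for
$i = 1, \dots, m_{\mathrm{lin}}$, which implies (i), or else there is a nonzero
$\tilde{k} \in \mathbb{Z}^{m_{\mathrm{lin}}}$ such that
$\|\tilde{k} \cdot \gamma\|_{\mathbb{R}/\mathbb{Z}} \leq \delta^{-O(1)}/N$, which implies (ii).
This establishes the claim. □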

If Claim 7.7 (ii) holds then consider the map $\eta : G \to \mathbb{R}/\mathbb{Z}$ defined by

$$\eta(x) := k \cdot \psi(x) \pmod{\mathbb{Z}}.$$

Since $k \in \mathbb{Z}^{m_{\mathrm{lin}}}$, $\eta$ is a horizontal character and we have
$|\eta| = |k| \ll \delta^{-O(1)}$. Finally we have

$$\eta \circ g(n) = \eta(g(1)^n) = nk \cdot \gamma \pmod{\mathbb{Z}},$$

and so $\|\eta \circ g\|_{C^\infty[N]} \ll \delta^{-O(1)}$. This completes the proof of
Theorem 7.1 in this case.

Suppose then that Claim 7.7 (i) holds. For each $i = 1, \dots, m$ consider the map
$\tau_i : G \to \mathbb{R}/\mathbb{Z}$ defined by

$$\tau_i(x) := r\eta_2([x, \exp(X_i)]).$$

Since $[\Gamma, \Gamma] \subseteq \Gamma$ and $[G, G] \subseteq G_2$ we see from the properties
established in Lemma 7.5 that $\tau_i$ is a horizontal character which annihilates $G_2$. It is not
hard to establish that $|\tau_i| \ll \delta^{-O(1)}$. To do this we write (as usual)

$$\tau_i(x) = k_i \cdot \psi(x) \pmod{\mathbb{Z}},$$

where $k_i \in \mathbb{Z}^m$ (and in fact $k_i \in \mathbb{Z}^{m_{\mathrm{lin}}}$ since $\tau_i$
annihilates $G_2$). From the definition of $\tau_i$, the bound $r \ll \delta^{-O(1)}$, the
$\frac{1}{\delta}$-rationality of the basis $\mathcal{X}$ and Lemma A.3 we have

$$(k_i)_j = \tau_i(\exp(X_j)) = r\eta_2([\exp(X_j), \exp(X_i)]) \ll \delta^{-O(1)},$$

and so indeed $|\tau_i| = |k_i| \ll \delta^{-O(1)}$. Now we have

$$\tau_i \circ g(n) = n\tau_i(g(1)) = rn\zeta_i \pmod{\mathbb{Z}}$$

where the last equality follows from (7.15). By property (i), this implies that

$$\|\tau_i \circ g\|_{C^\infty[N]} \ll \delta^{-O(1)},$$

and so once again we have proved Theorem 7.1 unless $\tau_i = 0$ for all $i = 1, \dots, m$.

So far we have been successful in deducing Theorem 7.1 by induction on the degree $d$,
but we know from the example at the start of this section that it is not always possible
to make such a deduction as $G_\bullet$ may be "reducible" for $g$. It turns out that the case
we have not yet covered corresponds to this situation.

Suppose then that $\tau_i = 0$ for all $i$, so that $\eta_2([x, \exp(X_i)]) = 0$ for all $x \in G$
and all $i \in [m]$. Since the homomorphism $\eta_2$ annihilates
$[G, [G, G]] \subseteq [G, G_2]$, we see using the
identity = [x, z)^" 1 , [x,y]][x,y] that the map y \-t r] 2 ([x,y]) is a homomorphism 
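for any fixed $x$. It follows that $\eta_2([x, y]) = 0$ for all $x, y \in G$, or in other words
that $\eta_2$ annihilates $[G, G]$. Thus $\zeta = 0$ (cf. (7.15)) and (7.16) degenerates to

$$\|\beta + q\alpha h\|_{\mathbb{R}/\mathbb{Z}} \ll \delta^{-O(1)}/N$$

for $\gg \delta^{O(1)} N$ values of $h \in [N]$. By Lemma 3.2 this implies that

$$\|\alpha\|_{\mathbb{R}/\mathbb{Z}, \delta^{-O(1)}} \ll \delta^{-O(1)}/N^2,$$

and thus by (7.12)

$$\|\partial^2 Q(0)\|_{\mathbb{R}/\mathbb{Z}, \delta^{-O(1)}} \ll \delta^{-O(1)}/N^2,$$

where $Q$ was defined in (7.13). We have $Q(0) = Q(1) = 0$ and, by Lemma 7.6,
$\|\partial^i Q(0)\|_{\mathbb{R}/\mathbb{Z}, \delta^{-O(1)}} \ll \delta^{-O(1)}/N^i$ for $i \geq 3$.
Thus

$$\|\eta_2 \circ g_2\|_{C^\infty[N], \delta^{-O(1)}} \ll \delta^{-O(1)}.$$

Thus there exists $q$, $1 \leq q \leq \delta^{-O(1)}$, such that

$$\|q\eta_2 \circ g_2\|_{C^\infty[N]} \ll \delta^{-O(1)}.$$

For notational simplicity we rename $q\eta_2$ as $\eta_2$, thus

$$\|\eta_2 \circ g_2\|_{C^\infty[N]} \ll \delta^{-O(1)}. \tag{7.17}$$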

Roughly speaking, this statement means that $g$ exhibits some essentially linear
behaviour (in the "direction" orthogonal to $\eta_2$) inside $G_2$. For our purposes this means
that $G_2$ was too large to accurately capture the quadratic and higher order terms of $g$,
and we must pass to a finer filtration $G'_\bullet$ which does not have this drawback. This is
the point in the proof where we induct on the nonlinearity degree $m_*$.

Now $\eta_2 : G_2 \to \mathbb{R}/\mathbb{Z}$ has the form

$$\eta_2(x) = k \cdot \psi(x) \pmod{\mathbb{Z}},$$

where $k \in \mathbb{Z}^{m_2} \subseteq \mathbb{Z}^m$ satisfies $|k| \ll \delta^{-O(1)}$. In the
ensuing discussion we will also need the lift $\tilde{\eta}_2 : G_2 \to \mathbb{R}$ defined by

$$\tilde{\eta}_2(x) := k \cdot \psi(x).$$

Now the map $\theta : G_2 \times G_2 \to \mathbb{R}$ defined by
$\theta(x, y) := \tilde{\eta}_2(xy) - \tilde{\eta}_2(x) - \tilde{\eta}_2(y)$ is continuous,
$\mathbb{Z}$-valued and vanishes when $x = y = \mathrm{id}_G$. Since $G_2 \times G_2$ is connected
it follows that $\theta = 0$ identically, and hence the lift $\tilde{\eta}_2$ is a homomorphism.

Lemma 7.8 (A finer subgroup sequence). Define $G'_0 = G'_1 = G$ and
$G'_i = G_i \cap \ker\tilde{\eta}_2$ for $i \geq 2$. Then $G'_\bullet = (G'_i)_{i=0}^{\infty}$ is a
filtration with degree at most $d$ and nonlinearity degree $m'_* \leq m_* - 1$. Each $G'_i$ is
closed, connected and $\delta^{-O(1)}$-rational (with respect to our Mal'cev basis $\mathcal{X}$ on
$G/\Gamma$ adapted to $G_\bullet$).

Proof. Let $\pi : G_2 \to G_2/[G_2, G_2]$ be the natural projection. It follows from the
Baker-Campbell-Hausdorff formula $\exp(X)\exp(Y) = \exp(X + Y + \frac{1}{2}[X, Y] + \dots)$ that
$\pi \circ \exp : \mathfrak{g}_2 \to G_2/[G_2, G_2]$ is a linear map. Since
$\tilde{\eta}_2 : G_2 \to \mathbb{R}$ factors through $G_2/[G_2, G_2]$ it follows that
$\tilde{\eta}_2 \circ \exp : \mathfrak{g}_2 \to \mathbb{R}$ is also a linear map. For
$i = m_{\mathrm{lin}} + 1, \dots, m$ we have $\tilde{\eta}_2 \circ \exp(X_i) = k_i$, an integer of
magnitude $O(\delta^{-O(1)})$. Thus by simple linear algebra we see that each Lie algebra
$\mathfrak{g}'_i = \mathfrak{g}_i \cap \ker(\tilde{\eta}_2 \circ \exp)$ is spanned by
$O(\delta^{-O(1)})$-rational combinations of the $X_j$. Thus the $G'_i$ are
$O(\delta^{-O(1)})$-rational closed connected subgroups as claimed.

If $i, j \geq 2$ then it is clear that $[G'_i, G'_j] \subseteq G'_{i+j}$ since
$\tilde{\eta}_2 : G_2 \to \mathbb{R}$ is a homomorphism. We must also check that
$[G, G'_i] \subseteq G'_{i+1}$ for $i \geq 2$, which follows from the fact that
$[G, G_i] \subseteq [G, G_2] \subseteq \ker\tilde{\eta}_2$. The statement about $m'_*$ is immediate
from the fact that $\eta_2$ is nontrivial, and it is obvious that the degree of $G'_\bullet$ is at
most $d$. □

We now come to the main result of this section, which allows us to pass to a new
sequence $g' \in \mathrm{poly}(\mathbb{Z}, G'_\bullet)$ with smaller nonlinearity degree than $g$.

Lemma 7.9 (Factorization lemma). Suppose that (7.17) holds. Then we may factor
$g = \varepsilon g' \gamma$, where

(i) $\varepsilon \in \mathrm{poly}(\mathbb{Z}, G_\bullet)$, $\varepsilon(0) = \mathrm{id}_G$,
$\varepsilon$ is $(\delta^{-O(1)}, N)$-smooth (in the sense defined earlier)
and $\|\eta \circ \varepsilon\|_{C^\infty[N]} \ll \delta^{-O(1)}$ for all horizontal characters
$\eta : G \to \mathbb{R}/\mathbb{Z}$ with $0 < \|\eta\| \ll \delta^{-O(1)}$;

(ii) $g' \in \mathrm{poly}(\mathbb{Z}, G'_\bullet)$;

(iii) $\gamma \in \mathrm{poly}(\mathbb{Z}, G_\bullet)$ and $\gamma(n)\Gamma$ is periodic with
period $Q \ll \delta^{-O(1)}$.

We remark that this lemma is strikingly similar in form to Proposition 9.2 below.
The proof of the latter result will, in fact, be closely modelled on the proof of this one,
but will be rather easier.


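Proof. By Lemma 6.7 and the fact that $g_2(0) = g_2(1) = \mathrm{id}_G$ we have

$$\psi(g_2(n)) = \sum_{i=2}^{d} t_i \binom{n}{i},$$

where $t_i \in \mathbb{R}^m$ and the coordinate $(t_i)_j$ is equal to $0$ if $j \leq m - m_i$. Thus

$$\tilde{\eta}_2 \circ g_2(n) = \sum_{i=2}^{d} k \cdot t_i \binom{n}{i}.$$

From (7.17) we thus have

$$\|k \cdot t_i\|_{\mathbb{R}/\mathbb{Z}} \ll \delta^{-O(1)}/N^i,$$

$i = 2, \dots, d$. Since $|k| \ll \delta^{-O(1)}$ we may choose vectors $u_i \in \mathbb{R}^m$ with
$(u_i)_j = 0$ if $j \leq m - m_i$ such that $|t_i - u_i| \ll \delta^{-O(1)}/N^i$ and
$k \cdot u_i \in \mathbb{Z}$ for $i = 2, \dots, d$.

We may now pick vectors $v_i$ in $\mathbb{R}^m$ with $(v_i)_j = 0$ if $j \leq m - m_i$, all of whose
coordinates are rationals over some denominator $q \ll \delta^{-O(1)}$, such that
$k \cdot u_i = k \cdot v_i$ for $i = 2, \dots, d$.

Define sequences $\varepsilon, \gamma : \mathbb{Z} \to G$ by

$$\psi(\varepsilon(n)) := \sum_{i=2}^{d} \binom{n}{i}(t_i - u_i) \quad\text{and}\quad \psi(\gamma(n)) := \sum_{i=2}^{d} \binom{n}{i} v_i, \tag{7.18}$$

and set

$$g'(n) := \varepsilon(n)^{-1} g(n) \gamma(n)^{-1}.$$

Observe from Lemma 6.7 that $\varepsilon, \gamma$ lie in $\mathrm{poly}(\mathbb{Z}, G_\bullet)$ and
take values in $G_2$. We verify the properties of $\varepsilon$, $g'$ and $\gamma$ in turn.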
That $\varepsilon(0) = \mathrm{id}_G$ is obvious. To see that $\varepsilon$ is
$(\delta^{-O(1)}, N)$-smooth we must confirm that
$d(\varepsilon(n), \varepsilon(n-1)) \ll \delta^{-O(1)}/N$ for all $n \in [N]$. Now as a fairly
immediate consequence of the definition of $\varepsilon$ we have that

$$|\psi(\varepsilon(n)) - \psi(\varepsilon(n-1))| \ll \delta^{-O(1)}/N$$

and

$$|\psi(\varepsilon(n))| \ll \delta^{-O(1)}$$

for all $n \in [N]$. The smoothness therefore follows from Lemma A.4. Finally we must
establish the statement about $\eta \circ \varepsilon$, where $\eta : G \to \mathbb{R}/\mathbb{Z}$
is a horizontal character. It is clear that any horizontal character
$\eta : G \to \mathbb{R}/\mathbb{Z}$ is represented in coordinates as

$$\eta(g) = k \cdot \psi(g) \pmod{\mathbb{Z}},$$

where $k_i = \eta(\exp(X_i))$ and so in particular $|k| \ll \delta^{-O(1)}$ if
$\|\eta\| \ll \delta^{-O(1)}$. It follows immediately from the definition of $\varepsilon$ that
$\|\eta \circ \varepsilon\|_{C^\infty[N]} \ll \delta^{-O(1)}$, as required.

Next we show that $g' \in \mathrm{poly}(\mathbb{Z}, G'_\bullet)$. Now we have

$$g'(n) = \varepsilon(n)^{-1} g(n) \gamma(n)^{-1} = \varepsilon(n)^{-1} g_2(n) \gamma(n)^{-1} \cdot g(1)^n \cdot [g(1)^{-n}, \gamma(n)].$$

The first derivative of the sequence $n \mapsto g(1)^n$ is $g(1)$ and all higher derivatives are
just $\mathrm{id}_G$, so this sequence has coefficients in any subgroup sequence. Also the sequence
$[g(1)^{-n}, \gamma(n)]$ lies in $\mathrm{poly}(\mathbb{Z}, G'_\bullet)$ since it is in
$\mathrm{poly}(\mathbb{Z}, G_\bullet)$ and takes values in $[G, G_2]$, which is annihilated by
$\tilde{\eta}_2$.

By the group property of $\mathrm{poly}(\mathbb{Z}, G'_\bullet)$ it therefore suffices to check that
$\varepsilon^{-1} g_2 \gamma^{-1} \in \mathrm{poly}(\mathbb{Z}, G'_\bullet)$. Since this sequence
lies in $\mathrm{poly}(\mathbb{Z}, G_\bullet)$, we need only check that it is annihilated by
$\tilde{\eta}_2$, that is to say that

$$-\tilde{\eta}_2(\gamma(n)) - \tilde{\eta}_2(\varepsilon(n)) + \tilde{\eta}_2(g_2(n)) = 0.$$

Computing using coordinates we see that the left-hand side here is

$$\sum_{i=2}^{d} \binom{n}{i} k \cdot (-v_i + u_i - t_i + t_i)$$

which does indeed vanish by our construction of $u_i$ and $v_i$.

Finally we must check that $\gamma(n)\Gamma$ is periodic. By definition and Lemma A.11 we see
that $\gamma$ is $\delta^{-O(1)}$-rational (cf. Definition 1.17), and then the result follows
instantly from Lemma A.12 (ii). □

We will shortly be completing the proof of Theorem 7.1 in the case that (7.17) holds,
which is the only case left to handle. We isolate a technical lemma which allows us to
deduce $C^\infty[N]$-properties of polynomials $p(n)$ from properties of $p(a + bn)$.

Lemma 7.10 (Single-parameter extrapolation). Suppose that $Q, N \geq 1$ are integers
and $a, b$ are rationals with height at most $Q$ such that $b \neq 0$. Let
$p : \mathbb{Z} \to \mathbb{R}/\mathbb{Z}$ be a polynomial sequence of degree $d$ and write
$\tilde{p}(n) := p(a + bn)$. Then there is some $q \in \mathbb{Z}$,
$1 \leq |q| \ll_d Q^{O_d(1)}$, such that

$$\|qp\|_{C^\infty[N]} \ll_d Q^{O_d(1)} \|\tilde{p}\|_{C^\infty[N]}.$$

We will defer the proof of this lemma to the next section, in which we prove a more
general multiparameter version of it (see Lemma 8.4).

Recall now that in our efforts to prove Theorem 7.1 by induction we had reduced
to the following situation: $g : \mathbb{Z} \to G$ is a polynomial sequence with
$g(0) = \mathrm{id}_G$ and $|\psi(g(1))| \leq 1$, and there is a function
$F : G/\Gamma \to \mathbb{C}$ with nontrivial vertical oscillation $\xi$ and
$\|F\|_{\mathrm{Lip}} \leq 1$ such that

$$|\mathbb{E}_{n \in [N]} F(g(n)\Gamma)| \geq \delta.$$

Furthermore we reduced to the case when $g$ is "reducible" in the sense that (7.17) holds.
This allows us to factor $g$ as in Lemma 7.9, obtaining

$$|\mathbb{E}_{n \in [N]} F(\varepsilon(n) g'(n) \gamma(n)\Gamma)| \geq \delta.$$

Choose a $Q \ll \delta^{-O(1)}$ such that $\gamma(n)\Gamma$ is periodic with period $Q$, and split
$[N]$ up into progressions of length between $N'$ and $2N'$, where
$N' := \lfloor \delta^C N \rfloor$, and common difference $Q$. By the pigeonhole principle, there is
some such progression $\{n_0 + nQ : n \in [N']\}$ such that

$$|\mathbb{E}_{n \in [N']} F(\varepsilon(n_0 + nQ) g'(n_0 + nQ) \{\gamma(n_0)\}\Gamma)| \geq \delta/2.$$
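
The splitting of $[N]$ into progressions of common difference $Q$ can be sketched as follows. This is only an illustration under stated assumptions: the function name `split_into_progressions` is hypothetical, the block length `N_prime` is passed in directly rather than computed from $\delta$, and the sketch requires `N >= Q * N_prime` so that every residue class is long enough.

```python
def split_into_progressions(N, Q, N_prime):
    """Partition {1, ..., N} into arithmetic progressions of common difference Q
    whose lengths lie between N_prime and 2 * N_prime - 1.

    Assumes N >= Q * N_prime, so every residue class mod Q contains at least
    N_prime elements of {1, ..., N}.
    """
    if N_prime < 1 or N < Q * N_prime:
        raise ValueError("need N_prime >= 1 and N >= Q * N_prime")
    progressions = []
    for r in range(1, Q + 1):
        residue_class = list(range(r, N + 1, Q))
        blocks = len(residue_class) // N_prime
        for i in range(blocks):
            start = i * N_prime
            # The last block absorbs the remainder, so its length is < 2 * N_prime.
            end = (i + 1) * N_prime if i < blocks - 1 else len(residue_class)
            progressions.append(residue_class[start:end])
    return progressions


parts = split_into_progressions(N=100, Q=3, N_prime=5)
```

Since the pieces partition $[N]$, the average over $[N]$ is a weighted average of the averages over the pieces, and so at least one piece must carry an average of comparable size; this is the pigeonhole step. Along each piece $n \mapsto \gamma(n)\Gamma$ is constant, which is what allows $\gamma(n_0 + nQ)$ to be replaced by $\{\gamma(n_0)\}$.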

Now since $\varepsilon$ is $(\delta^{-O(1)}, N)$-smooth we see, using the right-invariance of $d$,
that if $C$ is sufficiently large then

$$|\mathbb{E}_{n \in [N']} F(\varepsilon(n_0) g'(n_0 + nQ) \{\gamma(n_0)\}\Gamma)| \geq \delta/4. \tag{7.19}$$

Now $g' \in \mathrm{poly}(\mathbb{Z}, G'_\bullet)$ and hence, by Lemma 6.5, the sequence

$$\tilde{g}(n) := \{g(n_0)\}^{-1} \varepsilon(n_0) g'(n_0 + nQ) \{\gamma(n_0)\}$$

is also in $\mathrm{poly}(\mathbb{Z}, G'_\bullet)$. The inequality (7.19) may be rewritten as

$$|\mathbb{E}_{n \in [N']} \tilde{F}(\tilde{g}(n)\Gamma)| \geq \delta/4, \tag{7.20}$$

where $\tilde{F}(x) := F(\{g(n_0)\}x)$. By Lemma A.5 we have
$\|\tilde{F}\|_{\mathrm{Lip}} \ll \delta^{-O(1)}$. Noting that $\tilde{g}(0) = \mathrm{id}_G$, we
may thus apply the inductive hypothesis that Theorem 7.1 holds with parameters $(d, m_* - 1)$,
deducing that there is some horizontal character $\tilde{\eta}$ with
$0 < \|\tilde{\eta}\| \ll \delta^{-O(1)}$ such that

$$\|\tilde{\eta} \circ \tilde{g}\|_{C^\infty[N]} \ll \delta^{-O(1)}.$$

From Lemma 7.10 and the definition of $\tilde{g}$ it follows that there is a horizontal character
$\eta$ with $0 < \|\eta\| \ll \delta^{-O(1)}$, such that

$$\|\eta \circ g''\|_{C^\infty[N]} \ll \delta^{-O(1)},$$

where

$$g''(n) := \{g(n_0)\}^{-1} \varepsilon(n_0) g'(n) \{\gamma(n_0)\}.$$

Since $g'(0) = \mathrm{id}_G$, it follows that

$$\|\eta \circ g'\|_{C^\infty[N]} \ll \delta^{-O(1)}.$$

To complete the proof of the result we must, of course, replace $g'$ by
$g := \varepsilon g' \gamma$. To do this, note first that by multiplying $\eta$ by an integer of
size $O(\delta^{-O(1)})$ if necessary we in fact have

$$\|\eta \circ \gamma\|_{C^\infty[N]} = 0,$$

since the Mal'cev coordinates $\psi(\gamma(n))$ are always rationals over some denominator
$\ll \delta^{-O(1)}$. From the property (i) of Lemma 7.9 we have that
$\|\eta \circ \varepsilon\|_{C^\infty[N]} \ll \delta^{-O(1)}$.

Putting all this together, we obtain

$$\|\eta \circ g\|_{C^\infty[N]} \leq \|\eta \circ \varepsilon\|_{C^\infty[N]} + \|\eta \circ g'\|_{C^\infty[N]} + \|\eta \circ \gamma\|_{C^\infty[N]} \ll \delta^{-O(1)},$$

completing (at last!) the proof of Theorem 7.1. □

Let us remind the reader that, by remarks immediately following the statement of
Theorem 7.1, we have also completed the proof of Theorem 2.9.

8. The multiparameter Leibman theorem 

We have proved one of our main results, Theorem 2.9. In this section we bootstrap
this result into a multiparameter version of itself. Strictly speaking, this step is not
necessary in order to establish any of the results stated in the introduction; however,
the arguments here are not terribly difficult, and will be needed in order to obtain
multiparameter analogues of those results.

Recall from §6 the definition of $\mathrm{poly}(\mathbb{Z}^t, G_\bullet)$, the group of polynomial
sequences $g : \mathbb{Z}^t \to G$ with coefficients in $G_\bullet$. Recall also the definition of,
and notation for, multibinomial coefficients $\binom{\vec{n}}{\vec{j}}$.

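We need an analogue of the smoothness norms $C^\infty[N]$ in the multiparameter setting.
To set these up, we introduce the Taylor coefficients of a polynomial map
$g : \mathbb{Z}^t \to \mathbb{R}/\mathbb{Z}$.

Definition 8.1 (Taylor expansion). Suppose that $g : \mathbb{Z}^t \to \mathbb{R}/\mathbb{Z}$ is a
polynomial map. Then we define the Taylor coefficients $\alpha_{\vec{j}} \in \mathbb{R}/\mathbb{Z}$
for $\vec{j} \in \mathbb{N}_0^t$ to be the unique elements of $\mathbb{R}/\mathbb{Z}$ such that

$$g(\vec{n}) = \sum_{\vec{j}} \alpha_{\vec{j}} \binom{\vec{n}}{\vec{j}}$$

for all $\vec{n}$; it is not difficult to verify the existence and uniqueness of these
coefficients, and to check that if $g$ has degree at most $d$ then $\alpha_{\vec{j}} = 0$ unless
$|\vec{j}| \leq d$, where $|\vec{j}| := j_1 + \dots + j_t$.

Definition 8.2 (Smoothness norms). Suppose that $g : \mathbb{Z}^t \to \mathbb{R}/\mathbb{Z}$ is a
polynomial map with Taylor expansion

$$g(\vec{n}) = \sum_{\vec{j}} \alpha_{\vec{j}} \binom{\vec{n}}{\vec{j}}.$$

Then for any $t$-tuple $\vec{N} = (N_1, \dots, N_t)$ for $N_1, \dots, N_t \geq 1$ we write
$[\vec{N}] := [N_1] \times \dots \times [N_t]$ and

$$\|g\|_{C^\infty[\vec{N}]} := \sup_{\vec{j} \neq 0} \vec{N}^{\vec{j}} \|\alpha_{\vec{j}}\|_{\mathbb{R}/\mathbb{Z}},$$

where $\vec{N}^{\vec{j}} := N_1^{j_1} \dots N_t^{j_t}$.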
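
As an illustration of Definitions 8.1 and 8.2, the following sketch computes Taylor coefficients and the smoothness norm in the single-parameter case $t = 1$. It assumes that the polynomial is given by an exact rational-valued lift to $\mathbb{R}$, and it uses the standard fact that the coefficient of $\binom{n}{j}$ is the $j$-th forward difference at $0$; all function names are hypothetical.

```python
from fractions import Fraction
from math import comb


def dist_to_nearest_integer(alpha):
    """||alpha||_{R/Z}: distance from alpha to the nearest integer."""
    frac = alpha - (alpha // 1)
    return min(frac, 1 - frac)


def taylor_coefficients(g, d):
    """Return [a_0, ..., a_d] with g(n) = sum_j a_j * C(n, j), using a_j = (Delta^j g)(0)."""
    values = [g(n) for n in range(d + 1)]
    coeffs = []
    for _ in range(d + 1):
        coeffs.append(values[0])
        # Replace the list of values by its forward differences.
        values = [values[k + 1] - values[k] for k in range(len(values) - 1)]
    return coeffs


def reconstruct(coeffs, n):
    """Evaluate sum_j a_j * C(n, j) for an integer n >= 0."""
    return sum(a * comb(n, j) for j, a in enumerate(coeffs))


def smoothness_norm(g, d, N):
    """Single-parameter C^infty[N] norm: max over 1 <= j <= d of N^j * ||a_j||_{R/Z}."""
    coeffs = taylor_coefficients(g, d)
    return max(N ** j * dist_to_nearest_integer(coeffs[j]) for j in range(1, d + 1))


def g(n):
    # A slowly varying quadratic, viewed modulo 1.
    return Fraction(n, 100) + Fraction(n * n, 20000)


coeffs = taylor_coefficients(g, 2)
norm = smoothness_norm(g, 2, 10)
```

For this example the coefficients are $0$, $201/20000$ and $1/10000$, so with $N = 10$ the norm is $201/2000$. The multiparameter norm is defined in the same way, with $N^j$ replaced by $\vec{N}^{\vec{j}}$ and the maximum taken over nonzero multi-indices.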

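We have the following generalisation of Lemma 2.8.

Lemma 8.3 (Smooth polynomials vary slowly). Let $g : \mathbb{Z}^t \to \mathbb{R}/\mathbb{Z}$ be a
polynomial sequence of degree at most $d$ and suppose that $\vec{n} \in [\vec{N}]$. Then for any
$i \in [t]$ we have

$$\|g(\vec{n}) - g(\vec{n} - \vec{e}_i)\|_{\mathbb{R}/\mathbb{Z}} \ll_{t,d} \frac{1}{N_i} \|g\|_{C^\infty[\vec{N}]},$$

where $\vec{e}_i = (0, \dots, 0, 1, 0, \dots, 0)$ is the $i$th basis vector of $\mathbb{Z}^t$.

Proof. From the Taylor expansion and binomial identities we have

$$g(\vec{n}) - g(\vec{n} - \vec{e}_i) = \sum_{\vec{j} : j_i \geq 1} \alpha_{\vec{j}} \binom{\vec{n} - \vec{e}_i}{\vec{j} - \vec{e}_i}.$$

Thus

$$\|g(\vec{n}) - g(\vec{n} - \vec{e}_i)\|_{\mathbb{R}/\mathbb{Z}} \leq \|g\|_{C^\infty[\vec{N}]} \sum_{|\vec{j}| \leq d,\ j_i \geq 1} \vec{N}^{-\vec{j}} \binom{\vec{n} - \vec{e}_i}{\vec{j} - \vec{e}_i} \ll_{t,d} \frac{1}{N_i} \|g\|_{C^\infty[\vec{N}]},$$

as required. □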

We now give a multiparameter version of Lemma 7.10, which implies that lemma as
the $t = 1$ special case.

Lemma 8.4 (Multiparameter extrapolation). Suppose that $t, Q, N_1, \dots, N_t, d \geq 1$ are
integer parameters and that $a_i, b_i \in \mathbb{Q}$, $i = 1, \dots, t$, are rationals of height at
most $Q$ with $b_i \neq 0$. Let $p : \mathbb{Z}^t \to \mathbb{R}/\mathbb{Z}$ be a polynomial map of
degree at most $d$ and write $\tilde{p}(\vec{n}) := p(a_1 + b_1 n_1, \dots, a_t + b_t n_t)$. Then
there is some $q \in \mathbb{Z}$, $|q| \ll_{d,t} Q^{O_{d,t}(1)}$, such that

$$\|qp\|_{C^\infty[\vec{N}]} \ll_{d,t} Q^{O_{d,t}(1)} \|\tilde{p}\|_{C^\infty[\vec{N}]}.$$

Proof. First of all observe that, if $a, b \in \mathbb{Q}$ are rationals with height at most $Q$ and
$b \neq 0$, we may expand

$$\binom{(n - a)/b}{j} = \sum_{j'=0}^{j} c(a, b, j', j) \binom{n}{j'},$$

where $c(a, b, j', j)$ is a rational number with height $O_j(Q^{O_j(1)})$. Indeed we clearly have
$c(a, b, j, j) = b^{-j}$, and we may then compute $c(a, b, j - 1, j), c(a, b, j - 2, j), \dots$ in
turn.
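
The change of basis above can be made explicit. The sketch below computes the coefficients of $\binom{(n-a)/b}{j}$ in the basis $\binom{n}{j'}$ by taking forward differences at $0$, which is one possible way to obtain them and not necessarily the recursive order described in the text; the function names are hypothetical and exact rational arithmetic is assumed.

```python
from fractions import Fraction
from math import comb


def binomial(x, j):
    """Generalised binomial coefficient C(x, j) for rational x and integer j >= 0."""
    result = Fraction(1)
    for k in range(j):
        result *= (x - k) / Fraction(k + 1)
    return result


def expansion_coefficients(a, b, j):
    """Return [c_0, ..., c_j] with C((n - a) / b, j) = sum_{j'} c_{j'} * C(n, j')."""
    values = [binomial((Fraction(n) - a) / b, j) for n in range(j + 1)]
    coeffs = []
    for _ in range(j + 1):
        coeffs.append(values[0])
        # Forward differences: the j'-th difference at 0 is the coefficient of C(n, j').
        values = [values[k + 1] - values[k] for k in range(len(values) - 1)]
    return coeffs


a, b, j = Fraction(1, 2), Fraction(3), 3
coeffs = expansion_coefficients(a, b, j)
```

With these sample values the top coefficient is $b^{-j} = 1/27$, in agreement with the remark that $c(a, b, j, j) = b^{-j}$, and every coefficient is a rational whose denominator is a product of powers of the denominators of $a$, $1/b$ and of $j!$, which is the source of the height bound.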

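Multiplying such relations together we obtain a multiparameter version, viz.

$$\prod_{i=1}^{t} \binom{(n_i - a_i)/b_i}{j_i} = \sum_{\vec{j}' \leq \vec{j}} c(\vec{a}, \vec{b}, \vec{j}', \vec{j}) \binom{\vec{n}}{\vec{j}'},$$

where $\vec{j}' \leq \vec{j}$ means that each component of $\vec{j}'$ is at most the corresponding
component of $\vec{j}$.

Applying this allows us to give the Taylor coefficients $\alpha_{\vec{j}}$ of $p$ in terms of
those of $\tilde{p}$, which we denote $\tilde{\alpha}_{\vec{j}}$. Indeed we have

$$p(\vec{n}) = \tilde{p}\Big(\frac{n_1 - a_1}{b_1}, \dots, \frac{n_t - a_t}{b_t}\Big) = \sum_{\vec{j}} \tilde{\alpha}_{\vec{j}} \prod_{i=1}^{t} \binom{(n_i - a_i)/b_i}{j_i} = \sum_{\vec{j}} \sum_{\vec{j}' \leq \vec{j}} c(\vec{a}, \vec{b}, \vec{j}', \vec{j}) \tilde{\alpha}_{\vec{j}} \binom{\vec{n}}{\vec{j}'}$$

and so

$$\alpha_{\vec{j}} = \sum_{\vec{j}' \geq \vec{j}} c(\vec{a}, \vec{b}, \vec{j}, \vec{j}') \tilde{\alpha}_{\vec{j}'}.$$

To obtain the lemma, we simply need to take $q$ to be the product of all the denominators
of the rationals $c(\vec{a}, \vec{b}, \vec{j}, \vec{j}')$, which is clearly
$\ll_{d,t} Q^{O_{d,t}(1)}$. □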

Definition 8.5 (Multiparameter equidistribution). Let $G/\Gamma$ be a nilmanifold and let
$\delta > 0$. A finite sequence $(g(\vec{n})\Gamma)_{\vec{n} \in P}$ in $G/\Gamma$ indexed by a
finite non-empty set $P$ is $\delta$-equidistributed if we have

$$\Big| \mathbb{E}_{\vec{n} \in P} F(g(\vec{n})\Gamma) - \int_{G/\Gamma} F \Big| \leq \delta \|F\|_{\mathrm{Lip}}$$

for all Lipschitz functions $F : G/\Gamma \to \mathbb{C}$. If $\vec{N} = (N_1, \dots, N_t)$, we say
that a sequence $(g(\vec{n})\Gamma)_{\vec{n} \in [\vec{N}]}$ is totally $\delta$-equidistributed if
we have

$$\Big| \mathbb{E}_{\vec{n} \in P_1 \times \dots \times P_t} F(g(\vec{n})\Gamma) - \int_{G/\Gamma} F \Big| \leq \delta \|F\|_{\mathrm{Lip}}$$

whenever $P_i$ are arithmetic progressions in $[N_i]$ of length at least $\delta N_i$ for each
$1 \leq i \leq t$.



We can now give the multiparameter version of Theorem 2.9.

Theorem 8.6 (Multiparameter quantitative Leibman theorem). Let $s, m, t \geq 1$ and
$0 < \delta < 1/2$, and let $N_1, \dots, N_t \geq 1$ and $d \geq 1$ be integers. Suppose that
$G/\Gamma$ is an $m$-dimensional nilmanifold equipped with a $\frac{1}{\delta}$-rational Mal'cev
basis $\mathcal{X}$ adapted to some filtration $G_\bullet$ of degree $d$, and that
$g \in \mathrm{poly}(\mathbb{Z}^t, G_\bullet)$. Then either
$(g(\vec{n})\Gamma)_{\vec{n} \in [\vec{N}]}$ is $\delta$-equidistributed, or else there is some
horizontal character $\eta$ with $0 < \|\eta\| \ll \delta^{-O_{d,m,t}(1)}$ such that

$$\|\eta \circ g\|_{C^\infty[\vec{N}]} \ll \delta^{-O_{d,m,t}(1)}.$$



Proof. We allow all implied constants to depend on $d$, $m$ and $t$. Suppose that $(g(\vec n)\Gamma)_{\vec n \in [\vec N]}$ is not $\delta$-equidistributed. Suppose to begin with that $N_1 \ge \delta^{-C}$.

A simple averaging argument confirms that, for $\gg \delta^{O(1)} N_2 \cdots N_t$ values of $(n_2, \dots, n_t) \in [N_2] \times \dots \times [N_t]$, the polynomial sequence $(g_{n_2,\dots,n_t}(n)\Gamma)_{n \in [N_1]}$ is not $\delta^{O(1)}$-equidistributed, where $g_{n_2,\dots,n_t}(n) := g(n, n_2, \dots, n_t)$.

For each such tuple $(n_2, \dots, n_t)$, Theorem 2.9 implies that there is some horizontal character $\eta_{n_2,\dots,n_t}$ with $0 < \|\eta_{n_2,\dots,n_t}\| \ll \delta^{-O(1)}$ such that

$$\|\eta_{n_2,\dots,n_t} \circ g_{n_2,\dots,n_t}\|_{C^\infty[N_1]} \ll \delta^{-O(1)}.$$

By pigeonholing in $\eta$ and passing to a thinner set of tuples $(n_2, \dots, n_t)$ we may assume that $\eta_{n_2,\dots,n_t} = \eta$ does not depend on $(n_2, \dots, n_t)$. Writing $p := \eta \circ g$ and expanding

$$p(n_1, \dots, n_t) = \sum_{i_1 = 0}^{d} p_{i_1}(n_2, \dots, n_t) \binom{n_1}{i_1},$$

where the $p_{i_1}$ are polynomials, we therefore see that

$$\|p_{i_1}(n_2, \dots, n_t)\|_{\mathbb{R}/\mathbb{Z}} \ll \delta^{-O(1)} / N_1^{i_1} \tag{8.1}$$

for $\gg \delta^{O(1)} N_2 \cdots N_t$ values of $(n_2, \dots, n_t)$, for each $i_1 = 0, \dots, d$. In particular (for each $i_1$) there are $\gg \delta^{O(1)} N_3 \cdots N_t$ values of $(n_3, \dots, n_t)$ for which (8.1) holds for $\gg \delta^{O(1)} N_2$ values of $n_2$.

Suppose that $i_1 > 0$. Writing

$$p_{i_1}(n_2, \dots, n_t) = \sum_{i_2} p_{i_1,i_2}(n_3, \dots, n_t) \binom{n_2}{i_2}$$

and applying Lemma 4.5 we see that for $\gg \delta^{O(1)} N_3 \cdots N_t$ tuples $(n_3, \dots, n_t)$ there is $q_{i_1}(n_3, \dots, n_t) \ll \delta^{-O(1)}$ such that

$$\|q_{i_1}(n_3, \dots, n_t)\, p_{i_1,i_2}(n_3, \dots, n_t)\|_{\mathbb{R}/\mathbb{Z}} \ll \delta^{-O(1)} / N_1^{i_1} N_2^{i_2}.$$

Note that the application of Lemma 4.5 is valid because $i_1 > 0$ and $N_1 \ge \delta^{-C}$; this guarantees that the parameter $\varepsilon$ in that lemma is small enough. Pigeonholing in $(n_3, \dots, n_t)$ and passing to a somewhat smaller set of these tuples we may suppose that $q_{i_1} = q_{i_1}(n_3, \dots, n_t)$ is constant.

We now continue in this vein, obtaining successively quantities $q_{i_1,i_2,\dots,i_r} \ll \delta^{-O(1)}$. At the final stage we obtain

$$\|q_{i_1,\dots,i_t}\, p_{i_1,\dots,i_t}\|_{\mathbb{R}/\mathbb{Z}} \ll \delta^{-O(1)} / N_1^{i_1} \cdots N_t^{i_t}$$

or, in our earlier notation,

$$\|q_{\vec i}\, p_{\vec i}\|_{\mathbb{R}/\mathbb{Z}} \ll \delta^{-O(1)} / \vec N^{\vec i}. \tag{8.2}$$

This has been obtained for all $\vec i$ with $i_1 > 0$ on the assumption that $N_1 \ge \delta^{-C}$. By switching the indices $i_1, \dots, i_t$ if necessary, we may in fact obtain such a $q_{\vec i}$ whenever there is some $r$ with $N_r \ge \delta^{-C}$. If this is not the case for any $r$ then (8.2) holds anyway for trivial reasons (for any $q_{\vec i} \ll \delta^{-O(1)}$).

Note that by construction the $p_{\vec i}$ are simply the Taylor coefficients of $p$. Taking $q := \prod_{\vec i} q_{\vec i}$ we see that $q \ll \delta^{-O(1)}$ and that

$$\|q\, p_{\vec i}\|_{\mathbb{R}/\mathbb{Z}} \ll \delta^{-O(1)} / \vec N^{\vec i}$$

for each index $\vec i$ and thus

$$\|q \eta \circ g\|_{C^\infty[\vec N]} \ll \delta^{-O(1)}.$$

The theorem follows. □



9. A MULTIPARAMETER INITIAL FACTORIZATION THEOREM 



Having just established Theorem 8.6, we now use it to obtain an initial factorization theorem for multiparameter polynomial sequences. We first give a multiparameter version of Definition 1.18, the definition of a smooth sequence (the multiparameter version of a rational sequence is obvious).

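Definition 9.1 (Multiparameter smooth sequences). Let $G/\Gamma$ be a nilmanifold with a Mal'cev basis $\mathcal{X}$. Let $(\varepsilon(\vec n))_{\vec n \in \mathbb{Z}^t}$ be a multiparameter sequence in $G$, let $M \ge 1$ be an integer and let $\vec N = (N_1, \dots, N_t)$ with $N_i \ge 1$ for all $i$. We say that $(\varepsilon(\vec n))_{\vec n \in \mathbb{Z}^t}$ is $(M, \vec N)$-smooth if we have $d(\varepsilon(\vec n), \mathrm{id}_G) \le M$ and $d(\varepsilon(\vec n), \varepsilon(\vec n - \vec e_i)) \le M/N_i$ for all $\vec n \in [\vec N]$ and all $1 \le i \le t$.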

Here, then, is the main result of this section. 

Proposition 9.2 (Factorization of poorly-distributed polynomial sequences). Let $s, m, t \ge 1$, let $0 < \delta < 1/2$, and let $N_1, \dots, N_t \ge 1$ and $d \ge 0$ be integers. Write $\vec N := (N_1, \dots, N_t)$. Let $G/\Gamma$ be an $m$-dimensional nilmanifold with a $\frac{1}{\delta}$-rational Mal'cev basis $\mathcal{X}$ adapted to a filtration $G_\bullet$ of degree $d$, and suppose that $g \in \mathrm{poly}(\mathbb{Z}^t, G_\bullet)$. Suppose that $(g(\vec n)\Gamma)_{\vec n \in [\vec N]}$ is not totally $\delta$-equidistributed. Then there is a factorization $g = \varepsilon g' \gamma$, where $\varepsilon, g', \gamma \in \mathrm{poly}(\mathbb{Z}^t, G_\bullet)$ are polynomial sequences with the following properties:

(i) $\varepsilon : \mathbb{Z}^t \to G$ is $(O(\delta^{-O_{d,m,t}(1)}), \vec N)$-smooth;

(ii) $g' : \mathbb{Z}^t \to G'$ takes values in a connected proper subgroup $G'$ of $G$ which is $O(\delta^{-O_{d,m,t}(1)})$-rational relative to $\mathcal{X}$;

(iii) $\gamma : \mathbb{Z}^t \to G$ is $\delta^{-O_{d,m,t}(1)}$-rational.

Proof. We will allow all implied constants to depend on $d$, $m$ and $t$.

We first reduce to the case $g(0) = \mathrm{id}_G$, by factorizing $g = \{g(0)\}\, \tilde g\, [g(0)]$ where $\tilde g$ is the polynomial sequence $\tilde g := \{g(0)\}^{-1} g\, [g(0)]^{-1}$, for which $\tilde g(0) = \mathrm{id}_G$. If $(g(\vec n)\Gamma)_{\vec n \in [\vec N]}$ is not totally $\delta$-equidistributed, then one easily verifies using Lemma A.5 that $(\tilde g(\vec n)\Gamma)_{\vec n \in [\vec N]}$ is not totally $\tilde\delta$-equidistributed for some $\tilde\delta \gg \delta^{O(1)}$. Applying the proposition to $\tilde g$, we obtain a factorization $\tilde g = \tilde\varepsilon g' \tilde\gamma$. Setting $\varepsilon := \{g(0)\}\tilde\varepsilon$ and $\gamma := \tilde\gamma [g(0)]$, we certainly have $g = \varepsilon g' \gamma$. The sequence $\gamma$ is $\delta^{-O(1)}$-rational by Lemma A.11 and (the multiparameter version of) Lemma A.12. The sequence $\varepsilon$ is $(\delta^{-O(1)}, \vec N)$-smooth by Lemma A.5.

Henceforth, then, we assume that $g(0) = \mathrm{id}_G$. By hypothesis, we can find progressions $P_i := \{a_i + b_i n_i : n_i \in [N_i']\}$ in $[N_i]$ with $N_i' \ge \delta N_i$ such that the polynomial sequence $\tilde g : \mathbb{Z}^t \to G$ defined by $\tilde g(\vec n) = g(a_1 + b_1 n_1, \dots, a_t + b_t n_t)$ is such that $(\tilde g(\vec n)\Gamma)_{\vec n \in [\vec N']}$ fails to be $\delta$-equidistributed, where $\vec N' := (N_1', \dots, N_t')$. By Lemma EH we have $\tilde g \in \mathrm{poly}(\mathbb{Z}^t, G_\bullet)$. Applying Theorem 8.6 we conclude the existence of a horizontal character $\tilde\eta : G \to \mathbb{R}/\mathbb{Z}$ with $0 < \|\tilde\eta\| \ll \delta^{-O(1)}$ such that

$$\|\tilde\eta \circ \tilde g\|_{C^\infty[\vec N']} \ll \delta^{-O(1)}.$$

At the expense of worsening the exponent of the $\delta^{-O(1)}$, we may replace $[\vec N']$ here by $[\vec N]$. Applying Lemma 8.4 we deduce that there is a horizontal character $\eta : G \to \mathbb{R}/\mathbb{Z}$ with $0 < \|\eta\| \ll \delta^{-O(1)}$ such that

$$\|\eta \circ g\|_{C^\infty[\vec N]} \ll \delta^{-O(1)}. \tag{9.1}$$

Take $G'$ to be the connected component of $\ker(\eta)$. Then $G'$ is rather clearly a subgroup of $G$ which is $O(\delta^{-O(1)})$-rational relative to $\mathcal{X}$.

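Write

$$\psi(g(\vec n)) = \sum_{\vec j} t_{\vec j} \binom{\vec n}{\vec j},$$

where $t_{\vec j} \in \mathbb{R}^m$. By Lemma 6.7 we know that the coordinate $(t_{\vec j})_i$ is equal to $0$ if $i \le m - m_{|\vec j|}$. The horizontal character $\eta$ is given in coordinates by

$$\eta(x) = k \cdot \psi(x),$$

where $k \in \mathbb{Z}^m$ with $|k| \ll \delta^{-O(1)}$, and (9.1) tells us that $\|k \cdot t_{\vec j}\|_{\mathbb{R}/\mathbb{Z}} \ll \delta^{-O(1)} / \vec N^{\vec j}$ for all $\vec j \ne 0$. Since $|k| \ll \delta^{-O(1)}$ we may choose vectors $u_{\vec j} \in \mathbb{R}^m$ such that $|t_{\vec j} - u_{\vec j}| \ll \delta^{-O(1)} / \vec N^{\vec j}$ and $k \cdot u_{\vec j} \in \mathbb{Z}$ for all $\vec j \ne 0$. We then choose vectors $v_{\vec j} \in \mathbb{R}^m$, all of whose coordinates are rationals with complexity at most $O(\delta^{-O(1)})$, such that $k \cdot u_{\vec j} = k \cdot v_{\vec j}$ for all $\vec j \ne 0$. We may insist that the $u_{\vec j}$ and $v_{\vec j}$ have the same support properties as the $t_{\vec j}$, namely that $(u_{\vec j})_i = (v_{\vec j})_i = 0$ if $i \le m - m_{|\vec j|}$.

Define polynomial sequences $\varepsilon, \gamma : \mathbb{Z}^t \to G$ in terms of their Mal'cev coordinates by

$$\psi(\varepsilon(\vec n)) = \sum_{\vec j \ne 0} (t_{\vec j} - u_{\vec j}) \binom{\vec n}{\vec j} \quad \text{and} \quad \psi(\gamma(\vec n)) = \sum_{\vec j \ne 0} v_{\vec j} \binom{\vec n}{\vec j},$$

and

$$g' := \varepsilon^{-1} g \gamma^{-1}.$$

By Lemma 6.7 and the fact that $\mathrm{poly}(\mathbb{Z}^t, G_\bullet)$ is a group we see that all three of $\varepsilon$, $g'$ and $\gamma$ lie in $\mathrm{poly}(\mathbb{Z}^t, G_\bullet)$. We must check the claims (i), (ii) and (iii). The claim (ii) is clear. To prove (i), that is to say that $\varepsilon$ is $(\delta^{-O(1)}, \vec N)$-smooth, we need to show that

$$d(\varepsilon(\vec n), \varepsilon(\vec n - \vec e_i)) \ll \delta^{-O(1)} / N_i$$

for $\vec n \in [\vec N]$. But as a fairly immediate consequence of the definition of $\varepsilon$ we have the bound

$$|\psi(\varepsilon(\vec n)) - \psi(\varepsilon(\vec n - \vec e_i))| \ll \delta^{-O(1)} / N_i,$$

and so the desired bound follows from Lemma A.4. Finally we note that (iii) follows immediately from the definition of $\gamma$ and the properties of rational points described in Lemma A.11. □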
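For orientation, the shape of the factorization $g = \varepsilon g' \gamma$ can be seen in the simplest abelian toy case $G = \mathbb{R}$, $\Gamma = \mathbb{Z}$, $t = 1$, $g(n) = \alpha n$, where the group law is addition and the middle factor $g'$ is trivial. The following is only a sketch under those assumptions: the rational approximation is taken from `Fraction.limit_denominator` rather than from a horizontal character, and the name `factor_linear_orbit` is hypothetical.

```python
from fractions import Fraction


def factor_linear_orbit(alpha, N, max_denominator):
    """Split g(n) = alpha*n as eps(n) + gamma(n) for G = R, Gamma = Z.

    gamma(n) = (a/q)*n is the rational (periodic mod 1) part and
    eps(n) = (alpha - a/q)*n is the slowly varying part.
    """
    approx = alpha.limit_denominator(max_denominator)  # a/q with q <= max_denominator
    theta = alpha - approx

    def eps(n):
        return theta * n

    def gamma(n):
        return approx * n

    # Smallest M >= 1 such that |eps(n)| <= M and |eps(n) - eps(n-1)| <= M/N on [N].
    M = max(Fraction(1), abs(theta) * N)
    return eps, gamma, approx.denominator, M


alpha = Fraction(1, 3) + Fraction(1, 50000)
N = 1000
eps, gamma, q, M = factor_linear_orbit(alpha, N, 10)
```

Here `eps` plays the role of the smooth factor in the sense of Definition 9.1 (with $d$ the usual distance on $\mathbb{R}$), and `gamma` is periodic modulo $\mathbb{Z}$ with period `q`, mirroring the rational factor $\gamma$.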



10. A MULTIPARAMETER COMPLETE FACTORIZATION THEOREM 



The last major task of the paper is to iterate Proposition 9.2 to deduce a multiparameter version of our main result, Theorem 1.19. We first need a technical lemma.

Lemma 10.1 (Product of smooth sequences is smooth). Let $G/\Gamma$ be a nilmanifold of dimension $m$ and let $M \ge 2$ and $N_1, \dots, N_t \ge 1$ be parameters. Suppose that $\mathcal{X}$ is an $M$-rational Mal'cev basis for $G/\Gamma$ adapted to some filtration $G_\bullet$ of degree $d$, and suppose that the maps $\varepsilon_1, \varepsilon_2 : \mathbb{Z}^t \to G$ are $(M, \vec N)$-smooth in the sense of Definition 9.1. Then the product $\varepsilon_1\varepsilon_2$ is $(M^{O_{d,m,t}(1)}, \vec N)$-smooth.

Proof. First of all we have, for all $\vec n \in [\vec N]$, the bound $d(\varepsilon_1\varepsilon_2(\vec n), \mathrm{id}_G) \ll M^{O(1)}$.
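By the triangle inequality we have

$$d(\varepsilon_1\varepsilon_2(\vec n - \vec e_i), \varepsilon_1\varepsilon_2(\vec n)) \le d(\varepsilon_1(\vec n - \vec e_i)\varepsilon_2(\vec n - \vec e_i), \varepsilon_1(\vec n)\varepsilon_2(\vec n - \vec e_i)) + d(\varepsilon_1(\vec n)\varepsilon_2(\vec n - \vec e_i), \varepsilon_1(\vec n)\varepsilon_2(\vec n)).$$

Using the fact that $d(\varepsilon_1(\vec n), \mathrm{id}_G), d(\varepsilon_2(\vec n), \mathrm{id}_G) \le M$ for all $\vec n \in [\vec N]$, the result now follows immediately from the right-invariance of $d$, Lemma A.5 and Lemma A.4. □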

We can now state and prove the multiparameter version of Theorem 1.19 that we need.

Theorem 10.2 (Multiparameter factorization theorem). Let $s, m, t \ge 0$, let $M_0 \ge 2$ and $A > 0$, and let $N_1, \dots, N_t \ge 1$ and $d \ge 0$. Suppose that $G/\Gamma$ is an $m$-dimensional nilmanifold with an $M_0$-rational Mal'cev basis $\mathcal{X}$ adapted to some filtration $G_\bullet$ of degree $d$, and that $g \in \mathrm{poly}(\mathbb{Z}^t, G_\bullet)$. Then there is some $M$, $M_0 \le M \le M_0^{O_{A,m,d}(1)}$, a subgroup $G' \subseteq G$ which is $M$-rational with respect to $\mathcal{X}$ and a decomposition $g = \varepsilon g' \gamma$ into sequences $\varepsilon, g', \gamma \in \mathrm{poly}(\mathbb{Z}^t, G_\bullet)$ with the following properties:

(i) $\varepsilon$ is $(M, \vec N)$-smooth;

(ii) $g'$ takes values in $G'$ and with respect to the restriction of the metric $d$ the orbit $(g'(\vec n)\Gamma')_{\vec n \in P_1 \times \dots \times P_t}$ is $1/M^A$-equidistributed in $G'/\Gamma'$, for any subprogressions $P_i \subseteq [N_i]$ with $|P_i| \ge N_i/M^A$;

(iii) $\gamma$ is $M$-rational.

Proof. Let $1/M_0^A = \delta_1 > \delta_2 > \cdots$ be a sequence of parameters to be specified as the proof unfolds. For each $i = 1, \dots, t$ let $P_i \subseteq [N_i]$ be a progression of size at least $\delta_1 N_i$. From Proposition 9.2 we know that either $(g(\vec n))_{\vec n \in P_1 \times \dots \times P_t}$ is $\delta_1$-equidistributed on $G/\Gamma$, or else there is a factorization

$$g = \varepsilon_1 g_1 \gamma_1,$$

where $\varepsilon_1, g_1, \gamma_1 \in \mathrm{poly}(\mathbb{Z}^t, G_\bullet)$, $g_1$ takes values in some $O(\delta_1^{-O(1)})$-rational proper subgroup $G' \subseteq G$, $\varepsilon_1$ is $(O(\delta_1^{-O(1)}), \vec N)$-smooth and $\gamma_1$ is $O(\delta_1^{-O(1)})$-rational. Set $\Gamma' := G' \cap \Gamma$; we are now going to look at the distribution properties of $(g_1(\vec n))$ inside $G'/\Gamma'$ by applying Proposition 9.2 once more.

To do this we choose an $M_0^{O_{A,d,m}(1)}$-rational Mal'cev basis $\mathcal{X}'$ for $G'/\Gamma'$ adapted to the filtration $G'_\bullet := G_\bullet \cap G'$. This is possible by Proposition A.10 and we may furthermore ensure that each of the basis elements $X_i'$ is an $M_0^{O_{A,d,m}(1)}$-rational combination of the $X_i$. In view of Lemma A.6 we have

$$d'(x, y) \ll M_0^{O_{A,d,m}(1)} d(x, y) \tag{10.1}$$

for all $x, y \in G'/\Gamma'$.

Take $\delta_2 := c M_0^{-C}$ for some constants $c, C$ depending on $m$, $d$ and $A$. If these are chosen suitably, and if $(g_1(\vec n))_{\vec n \in P_1 \times \dots \times P_t}$ is $\delta_2$-equidistributed on $G'/\Gamma'$ with respect to the metric $d'$ for all progressions $P_i$ with $|P_i| \ge \delta_2 N_i$, then by (10.1) the conclusion of the theorem holds. If this is not the case then we apply Proposition 9.2 once again, obtaining a factorization $g_1 = \varepsilon_2 g_2 \gamma_2$ where $g_2$ takes values in some $O(\delta_2^{-O(1)})$-rational proper subgroup $G'' \subseteq G'$, $\varepsilon_2 : \mathbb{Z}^t \to G''$ is $(O(\delta_2^{-O(1)}), \vec N)$-smooth and $\gamma_2 : \mathbb{Z}^t \to G''$ is $O(\delta_2^{-O(1)})$-rational.

Substituting this into $g = \varepsilon_1 g_1 \gamma_1$ allows us to write

$$g = \varepsilon_1 \varepsilon_2 g_2 \gamma_2 \gamma_1.$$

Now it follows from Lemma A.6 that $\varepsilon_2 : \mathbb{Z}^t \to G''$ is in fact $(M_0^{O(1)}, \vec N)$-smooth when regarded as a map into $G$ (smoothness now being measured with respect to the metric $d$). By Lemma 10.1, $\varepsilon_1\varepsilon_2 : \mathbb{Z}^t \to G$ is also $(M_0^{O(1)}, \vec N)$-smooth. By Lemma A.11 (v), $\gamma_2\gamma_1 : \mathbb{Z}^t \to G$ is $M_0^{O(1)}$-rational. Thus, taking $\varepsilon := \varepsilon_1\varepsilon_2$, $\gamma := \gamma_2\gamma_1$ and $g' := g_2$, the conclusion of the theorem holds unless $(g_2(\vec n))_{\vec n \in P_1 \times \dots \times P_t}$ fails to be equidistributed on $G''/\Gamma''$. We now proceed as before, introducing a Mal'cev basis $\mathcal{X}''$ and encoding this lack of equidistribution as the failure of $(g_2(\vec n))_{\vec n \in P_1 \times \dots \times P_t}$ to be $\delta_3$-equidistributed relative to the metric $d'' = d_{\mathcal{X}''}$ for some $\delta_3 = c M_0^{-C}$ (the constants $c, C$ are, of course, not the same as before). We may then apply Proposition 9.2 once more, and so on.

It is clear that the total number of iterations is bounded by $m = \dim G$, since each iteration passes to a proper connected subgroup and so lowers the dimension. The implied constants in the $O()$ notation increase with each iteration, but since the total number of iterations is at most $m = O(1)$, this does not cause a difficulty. Thus we obtain a proof of our main theorem. □

It follows from Lemma A.12 (or rather the multidimensional version of it) that $(\gamma(\vec n)\Gamma)_{\vec n \in \mathbb{Z}^t}$ is periodic in each direction in the sense that $\gamma(\vec n + Q\vec e_i)\Gamma = \gamma(\vec n)\Gamma$ for some $Q \ll M^{O_{s,m}(1)}$. Setting $t = 1$, we recover Theorem 1.19.

We leave the straightforward deduction of Theorem 1.20 to the reader.

Appendix A. Facts about coordinates and Mal'cev bases 

Let us begin this appendix by discussing coordinate systems on a connected, simply-connected nilpotent Lie group $G$ of dimension $m$. A discrete and cocompact subgroup $\Gamma$, leading to a nilmanifold $G/\Gamma$, will be introduced in a little while. Let $\mathfrak{g}$ be the Lie algebra of $G$, and let $\exp : \mathfrak{g} \to G$ and $\log : G \to \mathfrak{g}$ be the exponential and logarithm maps, which are both diffeomorphisms. In this appendix all implied constants are allowed to depend on $m$ and $s$, and for notational brevity this dependence will usually be suppressed. The rationality parameter $Q$ will always be assumed to be at least 2.

Let us begin by recalling from §5 the notion of coordinates of the first and second kinds.

Definition A.1 (Coordinates). Let $\mathcal{X} = \{X_1, \dots, X_m\}$ be a basis for $\mathfrak{g}$. If

$$g = \exp(t_1 X_1 + \dots + t_m X_m)$$

then we say that $(t_1, \dots, t_m)$ are the coordinates of the first kind or exponential coordinates for $g$ relative to the basis $\mathcal{X}$. We write $(t_1, \dots, t_m) = \psi_{\mathcal{X},\exp}(g)$. If

$$g = \exp(u_1 X_1) \cdots \exp(u_m X_m)$$

then we say that $(u_1, \dots, u_m)$ are the coordinates of the second kind for $g$ relative to $\mathcal{X}$, and we write $(u_1, \dots, u_m) = \psi_{\mathcal{X}}(g)$.

From now on in this appendix (as in the main text) we will write $\psi := \psi_{\mathcal{X}}$ and $\psi_{\exp} := \psi_{\mathcal{X},\exp}$. When another basis $\mathcal{X}'$ for some Lie algebra $\mathfrak{g}'$ is present we shall write $\psi' := \psi_{\mathcal{X}'}$ and $\psi'_{\exp} := \psi_{\mathcal{X}',\exp}$.

Recall that $\mathcal{X}$ is said to be $Q$-rational if all the structure constants $c_{ijk}$ in the relations

$$[X_i, X_j] = \sum_k c_{ijk} X_k$$

are rationals of height at most $Q$.

The effect of a change of basis is easily understood in coordinates of the first kind 
(indeed, it merely effects a linear transformation of coordinates). Nilmanifolds, however, 
are best studied using coordinates of the second kind. It is, therefore, no surprise that 
the following lemma describing the passage between the two types of coordinate system 
is very useful. 

Lemma A.2 (Coordinates of the first and second type). (i) Let $\mathcal{X}$ be a basis for $\mathfrak{g}$ with the nesting property that

$$[\mathfrak{g}, X_i] \subseteq \mathrm{Span}(X_{i+1}, \dots, X_m) \tag{A.1}$$

for $i = 1, \dots, m-1$. Then the compositions $\psi_{\exp} \circ \psi^{-1}$ and $\psi \circ \psi_{\exp}^{-1}$ are both polynomial maps on $\mathbb{R}^m$ with degree $O(1)$. If $\mathcal{X}$ is $Q$-rational then all the coefficients of these polynomials are rational of height at most $Q^{O(1)}$.

(ii) Suppose that $G' \subseteq G$ is a closed, connected subgroup of dimension $m'$ with associated Lie algebra $\mathfrak{g}' \subseteq \mathfrak{g}$. Suppose $\mathcal{X}'$ is a basis for $\mathfrak{g}'$ with the nesting property. Then $\psi \circ \psi'^{-1}$ is a polynomial map from $\mathbb{R}^{m'}$ to $\mathbb{R}^m$ and $\psi' \circ \psi^{-1}$ is a polynomial map from $\psi(G') \subseteq \mathbb{R}^m$ to $\mathbb{R}^{m'}$. Both of these maps have degree $O(1)$. If $\mathcal{X}$ and $\mathcal{X}'$ are $Q$-rational and if each element $X_i'$ of $\mathcal{X}'$ is a $Q$-rational combination of the $X_i$, then all coefficients of these polynomials are rationals of height $Q^{O(1)}$.

Proof. (i) Recall the Baker-Campbell-Hausdorff formula, which states that

$$\log(\exp(X)\exp(Y)) = X + Y + \tfrac{1}{2}[X, Y] + \tfrac{1}{12}[X, [X, Y]] - \tfrac{1}{12}[Y, [X, Y]] + \cdots,$$

this expression being a sum of $O_s(1)$ terms, each of which is a rational number of height $O_s(1)$ times a commutator of order at most $s$ involving $X$s and $Y$s. Repeated use of this allows us to write $\exp(u_1 X_1) \cdots \exp(u_m X_m)$ in the form $\exp(t_1 X_1 + \dots + t_m X_m)$. Property (A.1) is easily seen to imply that the $t_i$ are polynomials in the $u_i$ with the specific form

$$\begin{aligned} t_1 &= u_1 \\ t_2 &= u_2 + P_2(u_1) \\ t_3 &= u_3 + P_3(u_1, u_2) \\ &\;\;\vdots \\ t_m &= u_m + P_m(u_1, \dots, u_{m-1}). \end{aligned} \tag{A.2}$$

This establishes the claim for $\psi_{\exp} \circ \psi^{-1}$. To prove the result for $\psi \circ \psi_{\exp}^{-1}$ we simply note that the relations (A.2) are of an "upper triangular" form which is easy to invert. Thus the $u_i$ are given in terms of the $t_j$ by polynomial relations of a similar upper triangular form. The quantitative statements follow by the same arguments, keeping track of the heights of the rational numbers involved. We leave the details to the reader.

(ii) Note the decomposition

$$\psi \circ \psi'^{-1} = (\psi \circ \psi_{\exp}^{-1}) \circ (\psi_{\exp} \circ \psi'^{-1}_{\exp}) \circ (\psi'_{\exp} \circ \psi'^{-1}).$$

Of the three maps here, the first one is a polynomial map from $\mathbb{R}^m$ to $\mathbb{R}^m$ by (i), and the third is a polynomial map from $\mathbb{R}^{m'}$ to $\mathbb{R}^{m'}$. The middle map is simply a linear transformation from $\mathbb{R}^{m'}$ to $\mathbb{R}^m$.

The composition $\psi' \circ \psi^{-1}$ may be dealt with in a very similar manner.

Once again the quantitative claims follow by the same arguments, keeping track of heights. We leave the details to the reader. □
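As a concrete sanity check of the upper-triangular relations (A.2), consider the Heisenberg group. The sketch below assumes the standard model by $3 \times 3$ upper unitriangular matrices with basis $X_1 = E_{12}$, $X_2 = E_{23}$, $X_3 = E_{13}$, so that $[X_1, X_2] = X_3$ and the nesting property (A.1) holds; this model and the helper names are assumptions made for illustration only.

```python
from fractions import Fraction


def matmul(A, B):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def exp_nilpotent(t1, t2, t3):
    """exp(t1*X1 + t2*X2 + t3*X3) with X1 = E12, X2 = E23, X3 = E13."""
    A = [[0, t1, t3], [0, 0, t2], [0, 0, 0]]
    A2 = matmul(A, A)
    # A is strictly upper triangular, so A^3 = 0 and exp(A) = I + A + A^2/2.
    return [[(1 if i == j else 0) + A[i][j] + Fraction(A2[i][j]) / 2
             for j in range(3)] for i in range(3)]


def second_kind(u1, u2, u3):
    """exp(u1*X1) exp(u2*X2) exp(u3*X3)."""
    return matmul(matmul(exp_nilpotent(u1, 0, 0), exp_nilpotent(0, u2, 0)),
                  exp_nilpotent(0, 0, u3))


def first_from_second(u1, u2, u3):
    """Instance of (A.2): t1 = u1, t2 = u2, t3 = u3 + P_3(u1, u2), P_3 = u1*u2/2."""
    return (u1, u2, u3 + Fraction(u1 * u2) / 2)


u = (Fraction(2), Fraction(3), Fraction(5))
t = first_from_second(*u)
```

In this model the only nontrivial correction polynomial is $P_3(u_1, u_2) = u_1 u_2 / 2$, whose single coefficient has bounded height, consistent with the quantitative part of Lemma A.2 (i).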

The upper-triangular form of the relations (A.2) allows us to prove the following key result, which describes group multiplication and inversion in coordinates.

Lemma A.3 (Multiplication and inversion in coordinates). Let $\mathcal{X}$ be a basis for $\mathfrak{g}$ with the nesting property (A.1). Let $x, y \in G$, and suppose that $\psi(x) = t$ and $\psi(y) = u$. Then

$$\psi(xy) = (t_1 + u_1,\; t_2 + u_2 + P_1(t_1, u_1),\; \dots,\; t_m + u_m + P_{m-1}(t_1, \dots, t_{m-1}, u_1, \dots, u_{m-1})),$$

where, for each $i = 1, \dots, m-1$, $P_i : \mathbb{R}^i \times \mathbb{R}^i \to \mathbb{R}$ is a polynomial of degree $O(1)$. Furthermore

$$\psi(x^{-1}) = (-t_1,\; -t_2 + \tilde P_1(t_1),\; \dots,\; -t_m + \tilde P_{m-1}(t_1, \dots, t_{m-1}))$$

where $\tilde P_i : \mathbb{R}^i \to \mathbb{R}$ is a polynomial of degree $O(1)$. Let $Q \ge 2$. If $\mathcal{X}$ is $Q$-rational then all the coefficients of the polynomials $P_i$, $\tilde P_i$ are rationals of height $Q^{O(1)}$.



Proof. By (A.2) we know that

$$\psi_{\exp}(x) = (t_1,\; t_2 + R_1(t_1),\; \dots,\; t_m + R_{m-1}(t_1, \dots, t_{m-1}))$$

and similarly for $\psi_{\exp}(y)$, where $R_i : \mathbb{R}^i \to \mathbb{R}$ is a polynomial for $i = 1, \dots, m-1$. It follows from the Baker-Campbell-Hausdorff formula and the nesting property (A.1) that

$$\psi_{\exp}(xy) = (t_1 + u_1,\; t_2 + u_2 + S_1(t_1, u_1),\; \dots,\; t_m + u_m + S_{m-1}(t_1, \dots, t_{m-1}, u_1, \dots, u_{m-1})),$$

where each $S_i : \mathbb{R}^i \times \mathbb{R}^i \to \mathbb{R}$ is again polynomial. The statement about the form of $\psi(xy)$ now follows from a further application of the relations (A.2), and the statement about $\psi(x^{-1})$ is an immediate corollary of it.

To obtain the quantitative versions of these statements we use the same arguments, keeping track of the heights of the rational numbers involved. We leave the details to the reader. □
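The formulas of Lemma A.3 can be written out explicitly in the Heisenberg group. The sketch below again assumes the $3 \times 3$ upper unitriangular matrix model with $X_1 = E_{12}$, $X_2 = E_{23}$, $X_3 = E_{13}$ (an assumption for illustration, as are the function names); in that model $P_1 = 0$ and $P_2(t_1, t_2, u_1, u_2) = -t_2 u_1$.

```python
from fractions import Fraction


def to_matrix(u):
    """Matrix of exp(u1*X1) exp(u2*X2) exp(u3*X3) in the Heisenberg group."""
    u1, u2, u3 = u
    return [[1, u1, u3 + u1 * u2], [0, 1, u2], [0, 0, 1]]


def from_matrix(A):
    """Recover the coordinates of the second kind from the matrix."""
    u1, u2 = A[0][1], A[1][2]
    return (u1, u2, A[0][2] - u1 * u2)


def matmul(A, B):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def multiply(t, u):
    """psi(xy) in terms of t = psi(x) and u = psi(y)."""
    return (t[0] + u[0], t[1] + u[1], t[2] + u[2] - t[1] * u[0])


def invert(t):
    """psi(x^{-1}) in terms of t = psi(x); the correction terms are 0 and -t1*t2."""
    return (-t[0], -t[1], -t[2] - t[0] * t[1])


t = (Fraction(1, 2), Fraction(3), Fraction(-1))
u = (Fraction(2), Fraction(1, 3), Fraction(4))
product_via_matrices = from_matrix(matmul(to_matrix(t), to_matrix(u)))
```

Comparing `multiply(t, u)` with `product_via_matrices` checks the coordinate formula against honest matrix multiplication in this model.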



Recall at this point Definition 2.2, in which a basis $\mathcal{X}$ is used to define a metric $d = d_{\mathcal{X}}$ on $G$. We defined $d$ to be the largest metric such that $d(x, y) \le |\psi(xy^{-1})|$ for all $x, y \in G$, where $|\cdot|$ denotes the $\ell^\infty$-norm on $\mathbb{R}^m$. For practical purposes it is important to have an understanding of such metrics in terms of the coordinates $\psi(x)$ and $\psi(y)$, or even in terms of coordinates $\psi'(x)$, $\psi'(y)$ relative to some other basis $\mathcal{X}'$. The following lemma provides some information in this regard. Here, and in the rest of this appendix, we write $d := d_{\mathcal{X}}$ and $d' := d_{\mathcal{X}'}$.

Lemma A.4 (Bounds for $d$ in terms of coordinates). Suppose that $Q \ge 2$. Suppose that $\mathcal{X}, \mathcal{X}'$ are two $Q$-rational bases for $\mathfrak{g}$, both satisfying the nesting condition (A.1). Suppose that each $X_i'$ is given by a $Q$-rational combination of the $X_i$ and vice versa. Then for all $x, y \in G$ with $|\psi'(x)|, |\psi'(y)| \le Q$ we have the bound

$$d(x, y) \ll Q^{O(1)} |\psi'(x) - \psi'(y)|, \tag{A.3}$$

and for all $x, y \in G$ with $d(x, \mathrm{id}_G), d(y, \mathrm{id}_G) \le Q$ we have the bound

$$|\psi'(x) - \psi'(y)| \ll Q^{O(1)} d(x, y). \tag{A.4}$$

Proof. Inequality (A.3) is by far the easier of the two inequalities claimed here and we prove it first. By definition we have $d(x, y) \le |\psi(xy^{-1})|$. Write $\psi'(x) = t$ and $\psi'(y) = u$; by Lemmas A.2 and A.3 we see that the coordinates $\psi(xy^{-1})$ are

$$(P_1(t, u), \dots, P_m(t, u)),$$

where each $P_j : \mathbb{R}^m \times \mathbb{R}^m \to \mathbb{R}$ is a polynomial of degree $O(1)$ whose coefficients are rationals of height $Q^{O(1)}$. Each of these polynomials of course vanishes when $t = u$, and so we can write (e.g.)

$$P_1(t, u) = P_1(t, u) - P_1(t, t) = \sum_{i=1}^{m} (t_i - u_i) R_i(t, u),$$

where each $R_i : \mathbb{R}^m \times \mathbb{R}^m \to \mathbb{R}$ is a polynomial of degree $O(1)$ whose coefficients are rationals of height $Q^{O(1)}$. (One way to see this is to expand $P_1$ as a sum of monomials $t^\alpha u^\beta$.) The bound (A.3) follows immediately.

The second bound, (A.4), is significantly more difficult. We begin by proving the special case in which $\mathcal{X} = \mathcal{X}'$ and $y = \mathrm{id}_G$, or in other words the following claim:

$$|\psi(x)| \ll Q^{O(1)} d(x, \mathrm{id}_G) \text{ uniformly for all } x \text{ with } d(x, \mathrm{id}_G) \le Q. \tag{A.5}$$

Write $\kappa(x, y) := \min(|\psi(xy^{-1})|, |\psi(yx^{-1})|)$. We will use the bound

$$|\psi(x) - \psi(y)| \ll Q^{O(1)} \kappa(x, y) (1 + \kappa(x, y) + |\psi(y)|)^{O(1)}. \tag{A.6}$$

To prove this when $\kappa(x, y) = |\psi(xy^{-1})|$ we proceed much as in the proof of (A.3): set $x = zy$ and use Lemma A.3 to expand $\psi(x) - \psi(y) = \psi(zy) - \psi(y)$ as a polynomial in the coordinates of $v = \psi(y)$ and $w = \psi(z)$ which vanishes when $w = 0$. When $\kappa(x, y) = |\psi(yx^{-1})|$ we proceed similarly, setting $x = yz^{-1}$.

From (A.6) we see in particular that if $|\psi(y)| \le 1$ and $\kappa(x, y) \le 1$, then

$$|\psi(x)| \le |\psi(y)| + C Q^C \kappa(x, y)$$

for some constant $C \ge 1$. Iterating this we see that if $x_0, \dots, x_n$ are elements of $G$ with $x_0 = \mathrm{id}_G$ and $\kappa(x_0, x_1) + \dots + \kappa(x_{n-1}, x_n) \le C^{-1} Q^{-C}$ then

$$|\psi(x_n)| \le C Q^C (\kappa(x_0, x_1) + \dots + \kappa(x_{n-1}, x_n)).$$

Inspecting the definition of $d$, we conclude that

$$|\psi(x)| \ll Q^{O(1)} d(x, \mathrm{id}_G) \text{ whenever } d(x, \mathrm{id}_G) \le C^{-1} Q^{-C}. \tag{A.7}$$

By right-invariance and symmetry of $d$, we can amplify this to

$$\kappa(x, y) \ll Q^{O(1)} d(x, y) \text{ whenever } d(x, y) \le C^{-1} Q^{-C}. \tag{A.8}$$

The estimate (A.7) is almost what we need, except that the bound on $d(x, \mathrm{id}_G)$ is too strict. To relax it, we argue as follows. To obtain (A.5), it suffices to show that

$$|\psi(x_n)| \ll Q^{O(1)} (\kappa(x_0, x_1) + \dots + \kappa(x_{n-1}, x_n))$$

whenever $x_0, \dots, x_n \in G$ with $x_0 = \mathrm{id}_G$ and $\kappa(x_0, x_1) + \dots + \kappa(x_{n-1}, x_n) \le 2Q$ (say).

Using a greedy algorithm, split the path $(x_0, \dots, x_n)$ into $O(Q^{O(1)})$ paths $(x_i, \dots, x_j)$ with $\kappa(x_i, x_{i+1}) + \dots + \kappa(x_{j-1}, x_j) \le C^{-1} Q^{-C}$, plus $O(Q^{O(1)})$ singleton paths $(x_j, x_{j+1})$ with $C^{-1} Q^{-C} \le \kappa(x_j, x_{j+1}) \le 2Q$. Applying (A.8), we thus see that there exists a path $(y_0, \dots, y_r)$ with $r = O(Q^{O(1)})$, $y_0 = \mathrm{id}_G$, and $y_r = x_n$, such that $\kappa(y_i, y_{i-1}) \ll Q^{O(1)}$ for all $1 \le i \le r$. In particular (using Lemma A.3) if we write $g_i := y_i y_{i-1}^{-1}$ for $1 \le i \le r$, then we see that $|\psi(g_i)| \ll Q^{O(1)}$. On the other hand, we have the telescoping product

$$x_n = g_r \cdots g_1.$$

Now if $g_1, \dots, g_r \in G$ are any elements with $|\psi(g_i)| \le t$ for all $i$ then

$$|\psi(g_1 \cdots g_r)| \ll (1 + t)^{O(1)} r^{O(1)}.$$

This may be seen by applying Lemma A.3 repeatedly to expand the product out completely in coordinates. That the first coordinate is polynomially controlled is obvious, and it then follows that the second is also, and so on inductively. Applying this in the present situation gives $|\psi(x_n)| \ll Q^{O(1)}$, and similar arguments for each $i$ give that in fact $|\psi(x_i)| \ll Q^{O(1)}$ uniformly for $0 \le i \le n$. Applying (A.6) we have

$$|\psi(x_i)| \le |\psi(x_{i-1})| + O(Q^{O(1)} \kappa(x_{i-1}, x_i))$$

and (A.5) follows.

We have just established the special case $\mathcal{X} = \mathcal{X}'$, $y = \mathrm{id}_G$ of (A.4). We now deal with the case where $\mathcal{X} = \mathcal{X}'$ but $y$ is arbitrary. Suppose then that $d(x, \mathrm{id}_G), d(y, \mathrm{id}_G) \le Q$. Applying (A.5) we see that $|\psi(x)|, |\psi(y)| \ll Q^{O(1)}$. By Lemma A.3 we therefore have $|\psi(xy^{-1})| \ll Q^{O(1)}$, and hence by (A.3) it follows that $d(xy^{-1}, \mathrm{id}_G) \ll Q^{O(1)}$. Applying (A.5) once more, we see that

$$|\psi(xy^{-1})| \ll Q^{O(1)} d(xy^{-1}, \mathrm{id}_G),$$

which, since $d$ is right-invariant, implies that

$$|\psi(xy^{-1})| \ll Q^{O(1)} d(x, y). \tag{A.9}$$

The claimed result now follows immediately using (A.6).

Finally we turn to the general case in which $\mathcal{X}$ and $\mathcal{X}'$ may be different. We start with the special case of (A.4) just proved, namely

$$|\psi(x) - \psi(y)| \ll Q^{O(1)} d(x, y). \tag{A.10}$$

Applying (A.3) we obtain

$$d'(x, y) \ll Q^{O(1)} |\psi(x) - \psi(y)| \ll Q^{O(1)} d(x, y).$$

In particular we have $d'(x, \mathrm{id}_G), d'(y, \mathrm{id}_G) \ll Q^{O(1)}$. A second application of (A.10), with $\mathcal{X}$ replaced by $\mathcal{X}'$, then gives

$$|\psi'(x) - \psi'(y)| \ll Q^{O(1)} d'(x, y) \ll Q^{O(1)} d(x, y).$$

This concludes the proof of Lemma A.4. □

The metric $d$ is right-invariant, that is to say $d(xg, yg) = d(x, y)$ for all $x, y, g \in G$. It is useful to have, in addition, the following approximate left-invariance property.

Lemma A.5 (Approximate left-invariance of $d$). Suppose that $Q \ge 2$ and that $\mathcal{X}$ is a $Q$-rational basis for $\mathfrak{g}$ satisfying the nesting condition (A.1). Suppose that $g, x, y \in G$ are elements with $|\psi(x)|, |\psi(y)|, |\psi(g)| \le Q$. Then we have the bound

$$d(gx, gy) \ll Q^{O(1)} d(x, y).$$

Proof. We start by observing that uniformly in $g, z \in G$ we have the bound

$$|\psi(gzg^{-1})| \ll Q^{O(1)} (1 + |\psi(z)| + |\psi(g)|)^{O(1)} |\psi(z)|. \tag{A.11}$$

This follows by using Lemma A.3 to conclude that the components of $\psi(gzg^{-1})$ are polynomials of degree $O(1)$ with $Q^{O(1)}$-rational coefficients in the coordinates $v = \psi(g)$ and $w = \psi(z)$, and these polynomials all vanish when $w = 0$. Recall from Definition 2.2 that

$$d(x, y) = \inf\Big\{ \sum_{i=1}^{n} \min(|\psi(x_{i-1} x_i^{-1})|, |\psi(x_i x_{i-1}^{-1})|) : x_0, \dots, x_n \in G;\; x_0 = x;\; x_n = y \Big\}. \tag{A.12}$$

We see, then, that the lemma will follow from (A.11) (taking $z = x_i x_{i-1}^{-1}$ or $x_{i-1} x_i^{-1}$) if we can show that the infimum may be taken over all those $x_i, x_{i-1}$ which satisfy some bound $\min(|\psi(x_{i-1} x_i^{-1})|, |\psi(x_i x_{i-1}^{-1})|) \ll Q^{O(1)}$. But this follows from the inequality $d(x, y) \ll Q^{O(1)}$, which is an instant consequence of Lemma A.4. □

We conclude this subsection by recording the following result. 

Lemma A.6 (Comparison lemma). Suppose that $G' \subseteq G$ is a closed subgroup and that $\mathcal{X}, \mathcal{X}'$ are bases for $\mathfrak{g}, \mathfrak{g}'$ respectively which have the nesting property (A.1). Let $Q \ge 2$, and suppose that each $X_i'$ is a $Q$-rational combination of the $X_i$. Then we have the bounds

$$d'(x, y) \ll Q^{O(1)} d(x, y)$$

uniformly for all $x, y \in G'$ with $|\psi(x)|, |\psi(y)| \le Q$ and

$$d(x, y) \ll Q^{O(1)} d'(x, y)$$

uniformly for all $x, y \in G'$ with $|\psi'(x)|, |\psi'(y)| \le Q$.

Proof. We follow essentially the same argument used in the previous lemma. To prove the first bound, for example, replace (A.11) with the bound

$$|\psi'(z)| \ll Q^{O(1)} (1 + |\psi(z)|)^{O(1)} |\psi(z)|.$$

This follows immediately from Lemma A.2 (ii), which guarantees that $\psi'(z)$ is a polynomial in the coordinates $\psi(z)$ which vanishes when $\psi(z) = 0$. □

Mal'cev bases. Suppose that $G$ is a connected, simply-connected nilpotent Lie group with a filtration $G_\bullet$. Let us now introduce a discrete and cocompact subgroup $\Gamma$ to the discussion. Throughout the paper we have assumed that $G/\Gamma$ comes together with a special type of basis $\mathcal{X}$ called a Mal'cev basis adapted to $G_\bullet$, which is invoked whenever it is necessary to discuss the metric structure of $G/\Gamma$.

Let us recall from §2 the basic properties of these bases:

(i) For each $j = 0, \dots, m-1$ the subspace $\mathfrak{h}_j := \mathrm{Span}(X_{j+1}, \dots, X_m)$ is a Lie algebra ideal in $\mathfrak{g}$, and hence $H_j := \exp \mathfrak{h}_j$ is a normal Lie subgroup of $G$.

(ii) For every $i$, $0 \le i \le s$, we have $G_i = H_{m - \dim(G_i)}$ (or equivalently, $\mathfrak{g}_i = \mathfrak{h}_{m - \dim(\mathfrak{g}_i)}$).

(iii) Each $g \in G$ can be written uniquely as $\exp(t_1 X_1) \cdots \exp(t_m X_m)$, for $t_1, \dots, t_m \in \mathbb{R}$.

(iv) $\Gamma$ consists precisely of those elements which, when written in the above form, have all $t_1, \dots, t_m \in \mathbb{Z}$.

Mal'cev bases are not especially flexible in certain ways; for example it is not at all easy to take a Mal'cev basis on $G/\Gamma$ and use it to construct one on $G^\square/\Gamma^\square$ as we had to do in the proof of Lemma 7.4. For additional flexibility it is convenient to introduce the notion of a weak basis for $G/\Gamma$. These are only ever used in the process of constructing actual Mal'cev bases with desirable properties.

Definition A.7 (Weak bases). Let $\mathcal{X} = \{X_1, \dots, X_m\}$ be a basis for $\mathfrak{g}$. Let $Q \ge 2$ be a parameter. We say that $\mathcal{X}$ is a $Q$-rational weak basis for $G/\Gamma$ if $\mathcal{X}$ is $Q$-rational (cf. Definition 2.4) and if we have $\frac{1}{q}\mathbb{Z}^m \supseteq \psi_{\exp}(\Gamma) \supseteq q\mathbb{Z}^m$ for some $q \le Q$, that is to say the coordinates of $\log \Gamma$ relative to $\mathcal{X}$ are close to being integers.

Note carefully that $\log \Gamma$ is not necessarily a subgroup of $\mathfrak{g}$, as we saw in §5 in connection with the Heisenberg example.
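The two-sided inclusion in Definition A.7 and the failure of $\log \Gamma$ to be a subgroup can both be observed numerically in the Heisenberg group. The sketch below assumes the standard model in which coordinates of the first and second kind are related by $t_3 = u_3 + u_1 u_2 / 2$ (with $t_1 = u_1$, $t_2 = u_2$) and $\Gamma$ is the set of elements with integer coordinates of the second kind. It checks the inclusions with $q = 2$ on a finite box only, so it is an illustration rather than a proof, and the function names are hypothetical.

```python
from fractions import Fraction
from itertools import product


def first_kind(u):
    """psi_exp of the element whose coordinates of the second kind are u."""
    u1, u2, u3 = u
    return (Fraction(u1), Fraction(u2), u3 + Fraction(u1 * u2, 2))


def second_kind(t):
    """psi of the element whose coordinates of the first kind are t."""
    t1, t2, t3 = t
    return (Fraction(t1), Fraction(t2), t3 - Fraction(t1 * t2, 2))


def in_log_gamma(t):
    """True if exp(t1*X1 + t2*X2 + t3*X3) has integer coordinates of the second kind."""
    return all(Fraction(c).denominator == 1 for c in second_kind(t))


box = range(-3, 4)

# psi_exp(Gamma) is contained in (1/2) Z^3 (checked on the box only).
contained_in_half_lattice = all(
    (2 * c).denominator == 1
    for u in product(box, repeat=3) for c in first_kind(u))

# psi_exp(Gamma) contains 2 Z^3 (checked on the box only).
contains_double_lattice = all(
    in_log_gamma((2 * a, 2 * b, 2 * c)) for a, b, c in product(box, repeat=3))

# (1,0,0) and (0,1,0) lie in psi_exp(Gamma), but their sum (1,1,0) does not.
sum_in_log_gamma = in_log_gamma((1, 1, 0))
```

The last line exhibits two elements of $\psi_{\exp}(\Gamma)$ whose sum is not in $\psi_{\exp}(\Gamma)$, which is the phenomenon the remark above warns about.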

We record some simple facts about weak bases. 

Lemma A.8 (Weak bases: simple facts). Weak bases enjoy the following properties.

(i) Suppose that $\mathcal{X}$ is a $Q$-rational weak basis for $G/\Gamma$, and that $\mathcal{X}' = \{X_1', \dots, X_m'\}$ is another basis for $\mathfrak{g}$ with the property that each $X_i'$ is a $Q$-rational combination of the $X_i$. Then $\mathcal{X}'$ is a $Q^{O(1)}$-rational weak basis for $G/\Gamma$.

(ii) Suppose that $\mathcal{X}$ is a Mal'cev basis adapted to some subgroup sequence $G_\bullet$, that is to say conditions (i), (ii), (iii) and (iv) from the start of the section are satisfied. Suppose that $\mathcal{X}$ is $Q$-rational. Then $\mathcal{X}$ is an $O(Q^{O(1)})$-rational weak basis for $G/\Gamma$.

Proof. Part (i) is immediate. Part (ii) follows quickly from Lemma A.2. □

The next proposition allows us to construct Mal'cev bases from weak bases. If $\mathcal{X}$ is a Mal'cev basis for $G/\Gamma$ and if $G' \subseteq G$ is a subgroup, we say that $G'$ is $Q$-rational if the Lie algebra $\mathfrak{g}'$ is generated by $Q$-rational combinations of the basis elements $X_i$.

Proposition A.9 (Construction of Mal'cev bases). Suppose that $\mathcal{X}$ is a $Q$-rational weak basis for $G/\Gamma$ and that $G_\bullet$ is a filtration in which each subgroup $G_i$ is $Q$-rational. Then there is a Mal'cev basis $\mathcal{X}' = \{X_1', \dots, X_m'\}$ for $G/\Gamma$ adapted to $G_\bullet$ in which each $X_i'$ is a $Q^{O(1)}$-rational combination of the basis elements $X_i$. In particular, the Mal'cev basis $\mathcal{X}'$ is $Q^{O(1)}$-rational.

Proof. Take a basis for $\mathfrak{g}_d$ consisting of $Q$-rational linear combinations of the $X_i$. By straightforward linear algebra this may be extended to a basis of $\mathfrak{g}_{d-1}$ consisting of $Q^{O(1)}$-rational combinations of the $X_i$. This in turn may be extended to a basis of $\mathfrak{g}_{d-2}$ and so on. In this fashion we obtain a basis $\mathcal{Y} = \{Y_1, \dots, Y_m\}$ for $\mathfrak{g}$ as a vector space consisting of $Q^{O(1)}$-rational combinations of the $X_i$ such that each $\mathfrak{g}_i$ equals $\mathrm{Span}(Y_{j+1}, \dots, Y_m)$ where $j = m - m_i$. By Lemma A.8 (i) we see that $\mathcal{Y}$ is a $Q^{O(1)}$-rational weak basis for $G/\Gamma$.

Since $[\mathfrak{g}, \mathfrak{g}_i] \subseteq \mathfrak{g}_{i+1}$ for all $i$ we see that the weak basis $\mathcal{Y}$ enjoys the nesting property, that is to say $[\mathfrak{g}, Y_j] \subseteq \mathrm{Span}(Y_{j+1}, \dots, Y_m)$ for all $j$.

We now convert this basis $\mathcal{Y}$ into the desired Mal'cev basis by choosing $X_m' = c_m Y_m, \dots, X_1' = c_1 Y_1$ in turn so that

$$\exp(\mathrm{Span}(Y_{i+1}, \dots, Y_m)) \cap \Gamma = \{\exp(n_{i+1} X_{i+1}') \cdots \exp(n_m X_m') : n_{i+1}, \dots, n_m \in \mathbb{Z}\} \tag{A.13}$$

for $i = m-1, \dots, 0$. Such a basis $\mathcal{X}'$ has all of the properties (i), (ii), (iii) and (iv) required to qualify as a Mal'cev basis. Suppose this is done for $i = j$. Since $\mathcal{Y}$ is a $Q^{O(1)}$-rational weak basis for $G/\Gamma$ we see that

$$(\exp(\mathrm{Span}(Y_j, \dots, Y_m)) \cap \Gamma) / \exp(\mathrm{Span}(Y_{j+1}, \dots, Y_m))$$

is generated by $\exp(c_j Y_j)$ for some $c_j \in \mathbb{Q}$ with heights bounded by $Q^{O(1)}$. Taking $X_j' := c_j Y_j$, we see that (A.13) holds for $i = j - 1$ too. □

For applications (for example in the proof of Lemma 7.4) it is convenient to have the following variant of the above proposition.

Proposition A.10 (Mal'cev bases of subnilmanifolds). Suppose that $\mathcal{X} = \{X_1, \dots, X_m\}$ is a $Q$-rational Mal'cev basis for $G/\Gamma$ adapted to a filtration $G_\bullet$. Suppose that $G' \subseteq G$ is a $Q$-rational subgroup of $G$, and furthermore that $G'_\bullet$ is a filtration on $G'$ in which each of the groups $G_i'$ is $Q$-rational (with respect to the basis $\mathcal{X}$). Write $\Gamma' := \Gamma \cap G'$. Then $G'/\Gamma'$ has a Mal'cev basis $\mathcal{X}' = \{X_1', \dots, X_{m'}'\}$ adapted to $G'_\bullet$ in which each $X_i'$ is a $Q^{O(1)}$-rational combination of the $X_i$.

Proof. One simply observes that by linear algebra there is a basis $\mathcal{Y}' = \{Y_1, \dots, Y_{m'}\}$ for $\mathfrak{g}'$ together with an extension $\mathcal{Y} = \{Y_1, \dots, Y_m\}$ to a basis for $\mathfrak{g}$ such that each of the $Y_i$ is a $Q^{O(1)}$-rational combination of the $X_i$. By Lemma A.8, $\mathcal{Y}$ is a weak basis for $G/\Gamma$, and therefore $\mathcal{Y}'$ is a weak basis for $G'/\Gamma'$. The result now follows from Proposition A.9 applied to this weak basis. □

Rationality. We now record some simple results about rational points in nilmanifolds
$G/\Gamma$. Recall Definition 1.11: $g \in G$ is rational if $g^r \in \Gamma$ for some integer $r > 0$.
Recall also the quantitative version of this, Definition 1.17: $g \in G$ is $Q$-rational if $g^r \in \Gamma$
for some integer $r$, $0 < r \le Q$.

Lemma A.11 (Properties of rational points). Suppose that $\mathcal{X}$ is a $Q$-rational Mal'cev
basis adapted to some subgroup sequence $G_\bullet$, where $Q \ge 2$.

(i) If $\gamma \in G$, then $\gamma$ is rational if and only if $\psi(\gamma) \in \mathbb{Q}^m$.

(ii) The set of rational points in $G$ is a group.

(iii) If $\gamma \in G$ is $Q$-rational, then $\psi(\gamma) \in \frac{1}{Q'}\mathbb{Z}^m$ for some $Q'$, $1 \le Q' \ll Q^{O(1)}$, which
does not depend on $\gamma$.

(iv) If $\gamma \in G$ is such that $\psi(\gamma) \in \frac{1}{Q}\mathbb{Z}^m$, then $\gamma$ is $O(Q^{O(1)})$-rational.

(v) If $\gamma, \gamma'$ are $Q$-rational, then $\gamma\gamma'$ and $\gamma^{-1}$ are $O(Q^{O(1)})$-rational.

Proof. If $\gamma$ is rational, then by definition there exists $r \ge 1$ such that $\gamma^r \in \Gamma$, and
thus $\psi(\gamma^n) \in \mathbb{Z}^m$ whenever $n$ is a multiple of $r$. Now from Lemma 6.7 we know that
the coordinates $\psi(\gamma^n)$ are all polynomials in $n$ of degree $O(1)$; these vanish at zero, and take
integer values at multiples of $r$. By the Lagrange interpolation formula we conclude
that all the coefficients of these polynomials are rational, and so in particular we have
$\psi(\gamma) \in \mathbb{Q}^m$.

Suppose conversely that $\psi(\gamma) \in \mathbb{Q}^m$. Then by Lemma A.3 we see that each of $\psi(\gamma^2)$,
$\psi(\gamma^3), \dots$ also lies in $\mathbb{Q}^m$. By another application of Lemma 6.7 and the Lagrange
interpolation formula we conclude that each coordinate of $\psi(\gamma^n)$ is a polynomial with
rational coefficients which vanishes at zero. In particular it is easy to see that by
choosing $r \in \mathbb{N}$ suitably we may ensure that $\psi(\gamma^r) \in \mathbb{Z}^m$, which of course implies that
$\gamma^r \in \Gamma$.

Part (ii) follows immediately from (i) and Lemma A.3.

Claims (iii)-(v) follow by repeating the above arguments, but keeping track of the
heights of all the rational numbers involved; the key point is that the group operations,
as well as Lagrange interpolation, are all polynomial in nature and so all heights will
be $O(Q^{O(1)})$. We omit the routine details. □
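
As an illustration of part (i), the following Python sketch works in the Heisenberg group, which is our own choice of example and plays no role in the argument. We assume that $G$ is the group of upper unitriangular $3 \times 3$ real matrices, stored as triples $(x, y, z)$, that $\Gamma$ is the subgroup with integer entries, and that the Mal'cev coordinates are $\psi(x, y, z) = (x, y, z - xy)$. The helper names `mul`, `psi` and `rational_order` are hypothetical.

```python
from fractions import Fraction


def mul(g, h):
    """Product of two Heisenberg matrices stored as (x, y, z)."""
    (x, y, z), (a, b, c) = g, h
    return (x + a, y + b, z + c + x * b)


def psi(g):
    """Assumed Mal'cev coordinates: g = exp(t1 X1) exp(t2 X2) exp(t3 X3)."""
    x, y, z = g
    return (x, y, z - x * y)


def rational_order(g, bound):
    """Least r in 1..bound with g^r in Gamma (integer coordinates), else None."""
    power = g
    for r in range(1, bound + 1):
        if all(Fraction(t).denominator == 1 for t in psi(power)):
            return r
        power = mul(power, g)
    return None


# An element with rational coordinates of height at most 3.
g = (Fraction(1, 2), Fraction(1, 3), Fraction(0))
print(rational_order(g, 100))  # 12
```

Here $\psi(g^r) = (r/2, r/3, -r(r+1)/12)$, a polynomial in $r$ vanishing at zero, and the least admissible $r$ is $12$ rather than $6$, which is consistent with the polynomial loss in parts (iii) and (iv).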

Let us now recall the notion of a rational sequence, also given in Definition 1.11. A
sequence $\gamma : \mathbb{Z} \to G$ is rational if $\gamma(n)\Gamma$ is rational for all $n$, and it is $Q$-rational if
$\gamma(n)\Gamma$ is $Q$-rational for all $n$. The next lemma records some useful properties of rational
polynomial sequences.

Lemma A.12 (Properties of rational polynomial sequences). Suppose that $\gamma : \mathbb{Z} \to G$
is a polynomial sequence of degree $d$.

(i) Suppose that $\gamma$ is rational. Then $\gamma(n)\Gamma$ is periodic.

(ii) Suppose that there is a $Q$-rational Mal'cev basis $\mathcal{X}$ for $G/\Gamma$ and that $\gamma$ is
$Q$-rational. Then $\gamma(n)\Gamma$ is periodic with period $\ll Q^{O(1)}$.

Proof. (i). Let $\mathcal{X}$ be any Mal'cev basis for $G/\Gamma$. By Lemma 6.7 the coordinates
$\psi(\gamma(n))$ are all polynomials of degree $O(1)$, and by the previous lemma and the Lagrange
interpolation formula they all have rational coefficients. Clearing denominators, we thus
find some $q$ such that $\psi(\gamma(n)) \in \frac{1}{q}\mathbb{Z}^m$ for all integers $n$. By Lemma A.3 we see that
there is some $q' \in \mathbb{N}$ such that, for any $r \in \mathbb{Z}$, we have $\psi(\gamma(n+r)\gamma(n)^{-1}) \in \frac{1}{q'}\mathbb{Z}^m$.
Thus $\gamma(n)\Gamma$ is indeed periodic, with period $qq'$.

Part (ii) is proved in exactly the same way, once again taking care to keep track of
the heights of all rationals involved. □
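
The periodicity in part (i) can be observed numerically. The sketch below is an illustration only: it assumes the Heisenberg group of upper unitriangular $3 \times 3$ matrices stored as triples $(x, y, z)$, with $\Gamma$ the integer points and $\psi(x, y, z) = (x, y, z - xy)$, and it takes the linear sequence $\gamma(n) = g^n$ as a simple instance of a polynomial sequence. Cosets are compared through their unique representative with coordinates in $[0,1)^3$ (Lemma A.14). The names `coset_rep`, `orbit` and `least_period` are hypothetical, and the period is only checked on a finite window of the orbit.

```python
from fractions import Fraction
from math import floor


def mul(g, h):
    """Product of two Heisenberg matrices stored as (x, y, z)."""
    (x, y, z), (a, b, c) = g, h
    return (x + a, y + b, z + c + x * b)


def psi(g):
    """Assumed Mal'cev coordinates psi(x, y, z) = (x, y, z - x*y)."""
    x, y, z = g
    return (x, y, z - x * y)


def coset_rep(g):
    """Coordinates in [0,1)^3 of the representative of the coset g*Gamma."""
    x, y, _ = g
    # Right-multiply by an integer element to reduce the first two coordinates.
    t1, t2, t3 = psi(mul(g, (-floor(x), -floor(y), 0)))
    # A central integer shift then reduces the third coordinate.
    return (t1, t2, t3 - floor(t3))


def orbit(g, length):
    """Coset representatives of g^0, g^1, ..., g^(length-1)."""
    reps, power = [], (Fraction(0), Fraction(0), Fraction(0))
    for _ in range(length):
        reps.append(coset_rep(power))
        power = mul(power, g)
    return reps


def least_period(reps, max_period):
    """Least p <= max_period with reps[n] == reps[n+p] on the whole window."""
    for p in range(1, max_period + 1):
        if all(reps[n] == reps[n + p] for n in range(len(reps) - p)):
            return p
    return None


g = (Fraction(1, 2), Fraction(1, 3), Fraction(0))
print(least_period(orbit(g, 48), 24))  # 12
```

In this example the period equals the least $r$ with $g^r \in \Gamma$, since $\gamma(n+r)\Gamma = g^n g^r \Gamma = \gamma(n)\Gamma$ for such $r$.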

We leave the formulation and proof of the multidimensional version of this lemma
(that is, concerning maps $\gamma : \mathbb{Z}^t \to G$) to the reader; only trivial modifications are
required.

The next result, stating that conjugates of rational subgroups by rational elements
are rational, is not needed in the present paper. It is required in the companion paper
[13].

Lemma A.13 (Rational conjugates). Suppose that $\mathcal{X} = \{X_1, \dots, X_m\}$ is a $Q$-rational
Mal'cev basis for $G/\Gamma$ adapted to some filtration. Suppose that $\gamma \in G$ is $Q$-rational and
additionally that the coordinates $\psi(\gamma)$ are all bounded in magnitude by $Q$. Suppose that
$G' \subseteq G$ is a $Q$-rational subgroup. Then the conjugate $\gamma G' \gamma^{-1}$ is $Q^{O(1)}$-rational.

Proof. Set $H := \gamma G' \gamma^{-1}$ and let $\mathfrak{h}$ be the corresponding Lie algebra. Recall from
basic Lie theory the identity

\[ \log(\gamma \exp(X) \gamma^{-1}) = \operatorname{Ad}(\gamma) X, \]

where $\operatorname{Ad}(\gamma) : \mathfrak{g} \to \mathfrak{g}$ is the adjoint automorphism of $\mathfrak{g}$ associated to the element $\gamma \in G$.
For the purposes of this argument all we need is the following immediate consequence
of this identity: if $X'_1, \dots, X'_{m'}$ is a basis for the Lie algebra $\mathfrak{g}'$ then the elements

\[ X''_i := \log(\gamma \exp(X'_i) \gamma^{-1}) \]

are a basis for $\mathfrak{h}$. By assumption we may choose the $X'_i$ to be $Q$-rational combinations
of the $X_j$. It then follows from Lemmas A.2 and A.3 that each $X''_i$ is a $Q^{O(1)}$-rational
combination of the $X_j$. □
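
The conjugation identity can be checked by hand in small cases. The following Python sketch does so for $3 \times 3$ unitriangular matrices with exact rational arithmetic; it is an illustration under our own assumptions, namely that $G$ is the Heisenberg group, that $X_1, X_2, X_3$ are the elementary matrices $e_{12}, e_{23}, e_{13}$, and that $\operatorname{Ad}(\gamma)X = \gamma X \gamma^{-1}$ as for any matrix group. The helper names are hypothetical, and the truncated series for `exp_nil` and `log_unip` are valid only because strictly upper triangular $3 \times 3$ matrices have vanishing cube.

```python
from fractions import Fraction as F

I3 = [[F(int(i == j)) for j in range(3)] for i in range(3)]


def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def add(A, B, s=1):
    """Entrywise A + s*B."""
    return [[A[i][j] + s * B[i][j] for j in range(3)] for i in range(3)]


def exp_nil(X):
    """exp(X) = I + X + X^2/2 for strictly upper triangular 3x3 X."""
    return add(add(I3, X), matmul(X, X), F(1, 2))


def log_unip(g):
    """log(g) = N - N^2/2 with N = g - I, for unitriangular 3x3 g."""
    N = add(g, I3, -1)
    return add(N, matmul(N, N), F(-1, 2))


def heis(x, y, z):
    """Group element with entries x, y, z above the diagonal."""
    return [[F(1), F(x), F(z)], [F(0), F(1), F(y)], [F(0), F(0), F(1)]]


def lie(a, b, c):
    """Lie algebra element a*X1 + b*X2 + c*X3."""
    return [[F(0), F(a), F(c)], [F(0), F(0), F(b)], [F(0), F(0), F(0)]]


x, y, z = F(1, 2), F(1, 3), F(1, 5)
gamma = heis(x, y, z)
gamma_inv = heis(-x, -y, x * y - z)

X1 = lie(1, 0, 0)
conjugated = log_unip(matmul(matmul(gamma, exp_nil(X1)), gamma_inv))
adjoint = matmul(matmul(gamma, X1), gamma_inv)
print(conjugated == adjoint)  # True
```

In this instance both sides equal $X_1 - \tfrac{1}{3} X_3$, a rational combination of the basis whose height is controlled by that of $\psi(\gamma)$.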

Fundamental domain and reduction. The next lemma provides a description
of $G/\Gamma$ in terms of coordinates relative to any Mal'cev basis $\mathcal{X}$.

Lemma A.14 (Reducing to the fundamental domain). Let $\mathcal{X}$ be a Mal'cev basis adapted
to some subgroup sequence $G_\bullet$. Suppose that $g \in G$. Then we may write $g = \{g\}[g]$ in
a unique way, where $\psi(\{g\}) \in [0,1)^m$ and $[g] \in \Gamma$.



Proof. Recall Lemma A.3, which describes the multiplication on $G$ in coordinates
relative to $\mathcal{X}$. Using this we may iteratively construct $\gamma_m, \gamma_{m-1}, \dots, \gamma_1 \in \Gamma$ in such a
way that coordinates $i+1, \dots, m$ of $\psi(g \gamma_m \cdots \gamma_i)$ all lie in the interval $[0,1)$.

The uniqueness also follows easily from Lemma A.3: if $\psi(x\gamma), \psi(x) \in [0,1)^m$ then we
may equate coefficients of $\psi(\gamma)$ starting at the right to deduce that $\gamma = \mathrm{id}_G$. □
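
The factorisation $g = \{g\}[g]$ can be computed explicitly in simple cases. The Python sketch below does this in the Heisenberg group, which is our own choice of example: we assume that $G$ is the group of upper unitriangular $3 \times 3$ matrices stored as triples $(x, y, z)$, that $\Gamma$ consists of the integer points, and that $\psi(x, y, z) = (x, y, z - xy)$. The function names are hypothetical. In these particular coordinates it is convenient to reduce the first two coordinates before the third, since right multiplication by a central integer element changes only the last coordinate.

```python
from fractions import Fraction
from math import floor


def mul(g, h):
    """Product of two Heisenberg matrices stored as (x, y, z)."""
    (x, y, z), (a, b, c) = g, h
    return (x + a, y + b, z + c + x * b)


def inv(g):
    """Inverse of the Heisenberg matrix (x, y, z)."""
    x, y, z = g
    return (-x, -y, x * y - z)


def psi(g):
    """Assumed Mal'cev coordinates psi(x, y, z) = (x, y, z - x*y)."""
    x, y, z = g
    return (x, y, z - x * y)


def reduce_to_fundamental_domain(g):
    """Return ({g}, [g]) with g = {g}[g], psi({g}) in [0,1)^3, [g] in Gamma."""
    x, y, _ = g
    a, b = -floor(x), -floor(y)
    # After shifting by (a, b, 0), only the last coordinate may lie outside [0,1).
    c = -floor(psi(mul(g, (a, b, 0)))[2])
    gamma = (a, b, c)
    return mul(g, gamma), inv(gamma)


g = (Fraction(7, 2), Fraction(-5, 3), Fraction(1, 4))
frac_part, int_part = reduce_to_fundamental_domain(g)
print(psi(frac_part), int_part)
```

For the element above one obtains $\psi(\{g\}) = (1/2, 1/3, 1/12)$ and $[g] = (3, -2, 1)$.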

Metrics on nilmanifolds. Let $\mathcal{X}$ be a Mal'cev basis for some nilmanifold $G/\Gamma$.
Recall from Definition 2.2 the manner in which we used the metric $d = d_{\mathcal{X}}$ on $G$ to
define a "metric" on $G/\Gamma$ via

\[ d(x\Gamma, y\Gamma) = \inf_{\gamma, \gamma' \in \Gamma} d(x\gamma, y\gamma'). \]

We can now prove that $d$ really is a metric on $G/\Gamma$ (and thus the inverted commas
above can be dispensed with).

Lemma A.15 (Nondegeneracy of metric). Suppose that $\mathcal{X}$ is a rational Mal'cev basis
for a nilmanifold $G/\Gamma$, adapted to some filtration. Suppose that $d(x\Gamma, y\Gamma) = 0$. Then
$x \equiv y \pmod{\Gamma}$.

Proof. Since the metric $d$ on $G$ is right-invariant we have

\[ d(x\Gamma, y\Gamma) = \inf_{\gamma \in \Gamma} d(x, y\gamma). \]

It suffices to show that the inf here is actually a minimum, to which end we need
only show that for any $M$ there are just finitely many $\gamma \in \Gamma$ with $d(x, y\gamma) \le M$. By
Lemma A.5 this assumption implies that $d(y^{-1}x, \gamma) \le M'$, for some $M'$ depending on
$M$, the rationality of the Mal'cev basis $\mathcal{X}$ and the size of the coordinates of $x$ and $y$.
This in turn implies that $d(\mathrm{id}_G, \gamma) \le M''$ which, in view of Lemma A.4, implies that
$|\psi(\gamma)| \le M'''$. But if $\gamma \in \Gamma$ then the coordinates $\psi(\gamma)$ are all integers, so the result
follows. □

Lemma A.16 (Nilmanifolds are bounded). Let $Q \ge 2$, and suppose that $\mathcal{X}$ is a
$Q$-rational Mal'cev basis for a nilmanifold $G/\Gamma$ (with respect to some filtration). Then
$d(x\Gamma, y\Gamma) \ll Q^{O(1)}$ uniformly in $x, y \in G$.

Proof. By Lemma A.14 we may choose $\gamma$ and $\gamma'$ so that $|\psi(x\gamma)|, |\psi(y\gamma')| \le 1$. The
claim now follows immediately from Lemma A.4. □

The final result of this appendix is not used in this paper but is required in §2 of the
companion paper [13].

Lemma A.17 (Comparison of metrics on nilmanifolds). Let $Q \ge 2$. Suppose that
$G' \subseteq G$ is a closed subgroup and that $\mathcal{X}, \mathcal{X}'$ are $Q$-rational Mal'cev bases for $G/\Gamma$ and
$G'/\Gamma'$ respectively such that each $X'_i$ is a $Q$-rational combination of the $X_i$. Let $d, d'$ be
the metrics induced on $G/\Gamma$ and $G'/\Gamma'$ respectively. Then for any $x, y \in G'$ we have

\[ d'(x\Gamma', y\Gamma') \ll Q^{O(1)} d(x\Gamma, y\Gamma) \]

and

\[ d(x\Gamma, y\Gamma) \ll Q^{O(1)} d'(x\Gamma', y\Gamma'). \]

Proof. We prove the second inequality first. By the proof of Lemma A.15 there is
some $\gamma' \in \Gamma'$ such that $d'(x\Gamma', y\Gamma') = d'(x, y\gamma')$. Here we may assume, using Lemma
A.14, that $|\psi'(y)| \le 1$. By Lemma A.16 we have $d'(x, y\gamma') \ll Q^{O(1)}$, and therefore
by Lemma A.4 and the triangle inequality we have $d'(\mathrm{id}_G, y\gamma') \ll Q^{O(1)}$. By a second
application of Lemma A.4 it follows that $|\psi'(y\gamma')| \ll Q^{O(1)}$. By Lemma A.6 we therefore
have $d(x, y\gamma') \ll Q^{O(1)} d'(x, y\gamma')$. Since $\Gamma' \subseteq \Gamma$, this implies that

\[ d(x\Gamma, y\Gamma) \le d(x, y\gamma') \ll Q^{O(1)} d'(x, y\gamma') = Q^{O(1)} d'(x\Gamma', y\Gamma'), \]

which is the second inequality claimed.

To prove the first inequality we make the same initial manoeuvres. That is, we may
assume that $|\psi(y)| \le 1$ and that there is some $\gamma \in \Gamma$ such that $d(x\Gamma, y\Gamma) = d(x, y\gamma)$.
Let $C$ be a constant to be specified later. If $d(x, y\gamma) \ge Q^{-C}$ then, by Lemma
A.16, the bound is trivial. Suppose, then, that $d(x, y\gamma) < Q^{-C}$. This is an assertion to
the effect that $\gamma$ lies "near" $G'$. We will use the rationality properties of the coordinates
of $\Gamma$ to conclude from this that $\gamma$ must actually lie in $G'$.

By Lemma A.5 and Lemma A.3 we obtain $d(z, \gamma) \ll Q^{O(1)-C}$, where $z := y^{-1}x$.
Since $d(z, \mathrm{id}_G) \ll Q^{O(1)}$ we have $d(\gamma, \mathrm{id}_G) \ll Q^{O(1)}$, and so by Lemma A.4 it follows
that $|\psi(z) - \psi(\gamma)| \ll Q^{O(1)-C}$. It follows from this and Lemma A.2 that

\[ |\psi_{\exp}(z) - \psi_{\exp}(\gamma)| \ll Q^{O(1)-C}. \tag{A.14} \]

Now $G'$ is defined, in exponential or type I coordinates, as the intersection of the kernels
of $O(1)$ linear forms with rational coefficients of height $O(Q^{O(1)})$. The coordinates $\psi(\gamma)$
are integers and so the type I coordinates $\psi_{\exp}(\gamma)$ are, by Lemma A.2, rationals of height
$O(Q^{O(1)})$. The element $z$, of course, lies in $G'$. If $C$ is chosen sufficiently large, it follows
from these observations and (A.14) that indeed $\gamma$ lies in $G'$ and hence in $\Gamma'$.

We now have that $d(x, y\gamma') \ll Q^{O(1)}$, where $\gamma' = \gamma$ lies in $\Gamma'$. One final application
of Lemma A.6 implies that $d'(x, y\gamma') \ll Q^{O(1)} d(x, y\gamma')$, from which it of course follows
that

\[ d'(x\Gamma', y\Gamma') \le d'(x, y\gamma') \ll Q^{O(1)} d(x, y\gamma') = Q^{O(1)} d(x\Gamma, y\Gamma). \]

This concludes the proof. □